2007.06.07 -- This application has been archived. Local users, see a CSB staff member for help if you need to run this program in the Core.
MAXSCALE is a FORTRAN program for local scaling of one dataset (Fhkl's) to another. Usually this is used for scaling derivative data to native. Only the "derivative" dataset is modified. For each reflexion in the derivative dataset, a sphere of reflexions in its local neighborhood of reciprocal space and the corresponding sphere of reflexions in the native dataset are used to determine the scale factor that will be applied only to that central derivative reflexion. The central derivative reflexion is always omitted from the local neighborhood sphere; if there is reason to believe that some of the surrounding reflexions may be correlated in amplitude with the central reflexion, then a small sphere surrounding it can also be omitted.
In scaling each reflexion, several internal passes can be performed. The first internal pass determines the local scale factor using a "median scale" algorithm, which is simply the scale factor that results in an equal number of reflexions with (Fnative > scale * Fderiv) and (scale * Fderiv > Fnative). This is only a first approximation to the scale factor, but it has the advantage of being insensitive to a few badly measured native or derivative reflexions. With this first local scale factor, the rms deviation between the native F's and scale * derivative F's can be calculated within the neighborhood sphere. In subsequent internal passes, native and derivative reflexion pairs are omitted from the local neighborhood sphere if their difference after applying the previous scale factor is greater than some user-selected multiple of this rms deviation. This is a way of reducing the impact of a few wildly deviant native or derivative reflexions on the local scale factor to be calculated in the current internal pass. For all internal passes after the first internal pass, the new local scale factor is calculated as the ratio of the sum of the native F's within the local sphere to the sum of the corresponding derivative F's in its local sphere. (It has been my experience that this estimate of the local scale is better than a least squares approach.)
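The internal passes described above can be sketched in a few lines of Python. This is only an illustration, not the program's FORTRAN source: the function name `local_scale` and the parameters `n_passes` and `rms_multiple` are hypothetical, and the sketch assumes that the rms deviation is recomputed over the whole neighborhood on every pass and that rejected pairs are re-evaluated each time rather than dropped permanently.

```python
import math
from statistics import median


def local_scale(f_native, f_deriv, n_passes=2, rms_multiple=3.0):
    """Estimate one local scale factor from paired neighborhood amplitudes.

    f_native, f_deriv: equal-length sequences of amplitudes for the reflexion
    pairs in the neighborhood sphere (central reflexion already left out).
    """
    # Pairs with a non-positive derivative amplitude cannot form a ratio.
    pairs = [(fn, fd) for fn, fd in zip(f_native, f_deriv) if fd > 0]

    # Pass 1, "median scale": the median of Fnative/Fderiv leaves as many
    # pairs with Fnative > scale*Fderiv as with scale*Fderiv > Fnative.
    scale = median(fn / fd for fn, fd in pairs)

    for _ in range(n_passes - 1):
        # rms deviation between native F's and scaled derivative F's,
        # using the scale from the previous pass (assumption: all pairs).
        rms = math.sqrt(
            sum((fn - scale * fd) ** 2 for fn, fd in pairs) / len(pairs)
        )
        # Omit pairs deviating by more than the chosen multiple of the rms.
        kept = [
            (fn, fd) for fn, fd in pairs
            if abs(fn - scale * fd) <= rms_multiple * rms
        ]
        # New scale: ratio of the sum of native F's to the sum of derivative F's.
        scale = sum(fn for fn, _ in kept) / sum(fd for _, fd in kept)
    return scale
```

For example, with native amplitudes `[10, 20, 30, 40, 500]` and derivative amplitudes `[5, 10, 15, 20, 25]`, the median pass already gives a scale of 2 despite the one wild pair, and a second pass with `rms_multiple=2.0` rejects that pair before taking the ratio of sums.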
The number of reflexions to be used in the local neighborhood sphere is user-set, and is generally chosen to minimize the cross R-factor. 100 reflexions is usually a good starting point. During each internal pass, the program will expand the sphere until this many native and derivative reflexion pairs are in the local neighborhood. If it cannot expand the sphere to obtain this number of reflexions, it will issue a warning. This should not happen often; if it does, increase the parameters dealing with the "deltas" and recompile.
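The sphere expansion can be illustrated as follows. This is a rough sketch under stated assumptions, not the program's own search: the function `neighborhood` and its parameters `omit_radius`, `max_radius` and `step` are hypothetical names, and distances are measured in index units for simplicity, whereas a real implementation would measure them in reciprocal space using the cell constants.

```python
def neighborhood(center, pairs, n_wanted, omit_radius=0.0,
                 max_radius=10.0, step=0.5):
    """Collect reflexion pairs around `center`, growing the sphere as needed.

    pairs: dict mapping (h, k, l) -> (f_native, f_deriv) for reflexions
           present in both datasets.
    Returns the list of (h, k, l) indices selected for the neighborhood.
    """
    def dist(hkl):
        # Distance from the central reflexion, in index units (assumption).
        return sum((a - b) ** 2 for a, b in zip(hkl, center)) ** 0.5

    radius = step
    while True:
        # The central reflexion is always left out; an optional small inner
        # sphere of radius `omit_radius` around it is left out as well.
        found = [hkl for hkl in pairs
                 if hkl != center and omit_radius < dist(hkl) <= radius]
        if len(found) >= n_wanted or radius >= max_radius:
            break
        radius += step

    if len(found) < n_wanted:
        # Mirrors the warning described above when the sphere cannot grow
        # far enough to reach the requested number of pairs.
        print(f"warning: only {len(found)} pairs found around {center}")
    return found
```

The `max_radius` limit plays the role that the compiled-in "deltas" play in the text: when it is too small for the requested number of reflexions, the search gives up and warns.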
The derivative dataset can be either reduced (merged) F's, or unmerged raw F's. The latter is particularly useful when there is not sufficient redundancy within a derivative dataset to allow good scaling between frames (such as with scalepack). In this case, the raw derivative F's can be local scaled to some reference dataset, such as a good (merged) native dataset, and then any redundant (symmetry-related) observations can be merged. If this approach is to be used, only one "orientation" of data (where only the rotation axis (phi) varies during collexion) should be local scaled at one time. Also, no more than 180 degrees of rotation should be local scaled at one time, so that no reflexions are duplicated. There is a delta_phi check in the program that can be used in this "raw data" mode, which only allows reflexions with phi angle differing from the central reflexion by less than some user-set rotation angle to be allowed in the local neighborhood. (This prevents reflexions measured at, say, phi=10 from being included in the neighborhood of a reflexion measured at phi=50 (if the delta phi limit is 40 or less). This mainly affects the low resolution reflexions. The appropriate limit depends on the rate of decay of the crystal and the exposure time per degree of oscillation. 10 to 20 degrees for delta_phi seems reasonable for most cases.)
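The delta_phi check amounts to a simple filter on candidate neighbors. A minimal sketch, in which the function name `within_delta_phi` and its arguments are hypothetical; no wrap-around of the angle is handled, on the assumption that no more than 180 degrees of rotation is scaled at once:

```python
def within_delta_phi(center_phi, neighbor_phis, delta_phi=15.0):
    """Flag which candidate neighbors pass the delta_phi check.

    A neighbor is allowed only if its phi angle differs from that of the
    central reflexion by less than delta_phi (all angles in degrees).
    """
    return [abs(phi - center_phi) < delta_phi for phi in neighbor_phis]


# The example from the text: with a limit of 40 degrees, a reflexion
# measured at phi=10 is kept out of the neighborhood of one at phi=50.
flags = within_delta_phi(50.0, [10.0, 45.0], delta_phi=40.0)
```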
Since the "native" dataset is almost always merged (ie, symmetry-equivalent or otherwise redundant measurements averaged to give the "unique" set), it is often useful to expand this data by Laue symmetry to fill reciprocal space before the derivative data is local scaled to it. To do this, a file containing the (real space) space group symmetry operations must be supplied. At present, the matrices must be entered numerically, as described below:
c...This file contains the (real space) space group symmetry operators
c...as obtained from the crystallographic tables, one operation per
c...line. If the symmetry operation is :
c...     a11 a12 a13     x     TransX     x'
c...     a21 a22 a23  *  y  +  TransY  =  y'
c...     a31 a32 a33     z     TransZ     z'
c...then the line would be
c...     a11 a12 a13 a21 a22 a23 a31 a32 a33
c...in free format (ie, separated by spaces.)
c...(The translations are not used in this program.)
c...(These files are very similar in format to the ones used by
c...Tom Terwilliger's HASSP and HEAVY programs.)
c...(The identity matrix should be included. Centering should be
c...omitted; ie, the file for P2 can be used for C2 also.)
For example, for space group P6(1), the symmetry file would be:
 1  0  0   0  1  0   0  0  1   0  0  0
 0 -1  0   1 -1  0   0  0  1   0  0  4
-1  1  0  -1  0  0   0  0  1   0  0  8
-1  0  0   0 -1  0   0  0  1   0  0  6
 0  1  0  -1  1  0   0  0  1   0  0 10
 1 -1  0   1  0  0   0  0  1   0  0  2
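To make the use of such a file concrete, here is a small Python sketch of how the operators could be read and applied to expand a reflexion by Laue symmetry. It is not taken from MAXSCALE: `read_symops` and `expand_hkl` are hypothetical names. It assumes the usual convention that a real-space rotation matrix R maps the index row vector (h k l) to the equivalent index (h k l)R, and that Laue symmetry adds the Friedel mate of each equivalent. Only the first nine numbers on each line are read; the three trailing numbers in the P6(1) example appear to be translations (in twelfths of a cell edge), which the program does not use.

```python
def read_symops(text):
    """Parse one operator per line; only the nine matrix elements are used."""
    ops = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.lower().startswith("c"):
            continue  # skip blank lines and c... comment lines
        v = [int(float(x)) for x in line.split()[:9]]
        ops.append([v[0:3], v[3:6], v[6:9]])
    return ops


def expand_hkl(hkl, ops):
    """Return all Laue-equivalent indices of one reflexion."""
    equivalents = set()
    for r in ops:
        # Row vector (h k l) times the real-space rotation matrix.
        eq = tuple(sum(hkl[i] * r[i][j] for i in range(3)) for j in range(3))
        equivalents.add(eq)
        equivalents.add(tuple(-x for x in eq))  # Friedel mate
    return equivalents


P61 = """\
1 0 0 0 1 0 0 0 1 0 0 0
0 -1 0 1 -1 0 0 0 1 0 0 4
-1 1 0 -1 0 0 0 0 1 0 0 8
-1 0 0 0 -1 0 0 0 1 0 0 6
0 1 0 -1 1 0 0 0 1 0 0 10
1 -1 0 1 0 0 0 0 1 0 0 2
"""

ops = read_symops(P61)
equivalents = expand_hkl((1, 2, 3), ops)
```

For a general reflexion such as (1 2 3), the six operators plus Friedel mates give twelve distinct indices, as expected for Laue group 6/m.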
MAXSCALE can be run interactively. All input is prompted for and is free-format.
The executable is
/srv/local/r4k/maxscale
which should be in your path on an SGI.
The first three questions ask for minimum and maximum limits on H, K and L in either of the datasets. These numbers don't have to be exact, but all H K L in either dataset should be within these ranges. If the ranges are large, the parameter MAXNHKLS might have to be increased and the program recompiled; the program will indicate if this has to be done.
The following is a sample input file for MAXSCALE (for VMS systems):
$r [rould.maxscale]maxscale
-65 65
-23 23
-36 36
[.data]eng940420a.prot
(3i4,f8.4)
Y
[rould.symops]P2-b.sym
[.data]eng940609a.prot
(3I4,F8.4,8X,2F6.2)
Y
j:eng940609a-scaled-0420.prot
(3i4,f8.4,8x,f6.2)
131.2 45.5 72.9 90 119 90
C
100
0
N
2
3
Disclaimer: This is a completely rewritten version of my old DSCALEAD program. This new version has been tested in a number of ways and has been used successfully for a few projects. There is always the possibility of bugs still residing in the program. Please let me know if one crosses your path.
Mark Rould (ROULD@pabo1.mit.edu)
This program is freely distributed and may be passed on freely; the curse of the scorpion upon anyone who tries to sell it or any part of it.
Center for Structural Biology (www.csb.yale.edu),
Yale University (www.yale.edu)
Contact: webadmin(at)mail^csb^yale^edu Last Modified: Thursday, 07-Jun-2007 11:02:05 EDT by P. Fleming