Tassel 4 Change History
(V4.3.15) July 30, 2015
(V4.3.14) March 26, 2015
- Added new enzymes to GBS Pipeline: AseI, AvaII, and KpnI-MspI
(V4.3.13) December 18, 2014
- Added -e ignore option for unspecified restriction enzyme. In this case barcodes and common adapter start sequences will be recognized, but chimeric DNA fragments (or partial digests) will not be trimmed.
- Added MslI. Fixed MspI common Y-adapters to match Poland et al. 2012. Added PstI-MspI-GDFcustom.
(V4.3.12) November 13, 2014
- Fixed a typo in the usage statement
- Added some new restriction enzymes
- Fix bug in GeneticMap that causes error when a marker present in the data being analyzed is not present in the GeneticMap
(V4.3.11) July 24, 2014
- Added new restriction enzyme KpnI
(V4.3.10) June 26, 2014
- added options to NamGwas to not resample and to set the resample percentage
- Added restriction enzyme NspI
- TAS-189 Handle negative values as double parameters for command line options
- Changed key file checking so that the minimal documented key file will work (Enzyme is not required).
- LibraryPrepID is now required (was not required in Tassel3).
(V4.3.9) June 5, 2014
- Added restriction enzymes NlaIII and SphI
(V4.3.8) May 15, 2014
- Modified clearVariants() and clearVariant() so that they work with ragged arrays
- Fixed getVariantOff() so that it works with ragged arrays
- Added ".topm" as an acceptable suffix for a binary TOPM in writeTOPM()
(V4.3.7) April 17, 2014
- Added BbvCI-MspI
- Added 5 new enzyme pairs
- Added restriction enzymes CviQI and Csp6I (isoschizomers of each other)
- parseSAMAlignment() now properly handles tags that don't align but are reverse complemented (i.e., flag = 20, or with 0x10 and 0x4 bits set)
- These now get reverse complemented before they are stored in the TOPM.
- Adapted parseSAMAlignment() to work properly with ragged arrays for variantOffsets and variantDefs
- Pointed runCompareGenosBetweenHapMapFilesPlugin() at DiscoverySNPCallerPlugin vs ProductionSNPCallerPlugin results for maize test data
- Fixed addVariant() and swap so that they work with ragged arrays for the variantOffsets and variantDefs
- Pointed runProductionSNPCallerPlugin() at the 20MB maize unit test data
- Fixed a bug: reportOnMissingSamples() moved outside a loop so that it is no longer called after every libraryPrepID, but for the whole set.
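Note on the parseSAMAlignment() entry above: in the SAM format, bit 0x4 marks a read as unmapped and bit 0x10 marks its sequence as reverse complemented, so flag 20 has both bits set. The following is a minimal Java sketch of that check, not the TASSEL implementation. The class and method names are hypothetical, and mapping non-ACGT bases to N is an assumption made for illustration.

```java
class SamFlagSketch {
    static final int FLAG_UNMAPPED = 0x4;   // SAM: segment unmapped
    static final int FLAG_REVERSE = 0x10;   // SAM: SEQ is reverse complemented

    static boolean isUnmapped(int flag) {
        return (flag & FLAG_UNMAPPED) != 0;
    }

    static boolean isReverse(int flag) {
        return (flag & FLAG_REVERSE) != 0;
    }

    /** Reverse complements a DNA string; anything other than A, C, G, T becomes N (assumption). */
    static String reverseComplement(String seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            switch (seq.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default:  sb.append('N'); break;
            }
        }
        return sb.toString();
    }

    /**
     * Hypothetical helper: restores the original tag orientation whenever the
     * reverse bit is set, whether or not the tag aligned.
     */
    static String sequenceToStore(int flag, String samSeq) {
        return isReverse(flag) ? reverseComplement(samSeq) : samSeq;
    }

    public static void main(String[] args) {
        int flag = 20; // 0x10 | 0x4
        System.out.println(isUnmapped(flag) + " " + isReverse(flag));
        System.out.println(sequenceToStore(flag, "AACG"));
    }
}
```

The point of the sketch is that the orientation test depends only on the 0x10 bit, so an unaligned tag (0x4 set) is handled by the same branch as an aligned one.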
(V4.3.6) February 20, 2014
- [tassel:bugs] #173 Tweak to greatly improve mergeAlignment function -code change from Jason
- Fixed race condition with CombineDataSetsPlugin that could cause following plugins to run more than once.
- Corrected VCF File entry when Tassel uses major allele instead of reference.
- Removed extra ## from VCF output
(V4.3.5) January 16, 2014
- add new clustering methods
- added code to make haplotype allele assignments to parents based on unimputed data
- changed getUsage so that it displays as info instead of error using ?
- added line of code to change minMAF to 0.05 in clusters if it was set to -1
- updated getUsage function and made it public
- add some functions to TOPMV3
- pointed runDiscoverySNPCallerPlugin() at test data files (to generate expected results)
- made changes so linkage marker imputation will work with latest imputed data
- Updating jar signatures
- Updated run_pipeline.pl and start_tassel.pl to work with older versions of perl
(V4.3.4) December 5, 2013
- add code to filter out hets before imputing
- Load key variables into memory for GBS V3 discovery SNP caller
- fixed minor bug in site score calculation which was comparing sites to themselves sometimes
(V4.3.3) October 31, 2013
- Resolved Tassel Issue #163 Adjust default filterAlignMinFreq from 0.1 to 0.0
- Fixed [tassel:bugs] #171 monomorphic loci - "N"s display in minor allele color
- This is coordinated with the first one. I kept getting errors because whatever search was returning the site # wasn't picking the first site at a position (and they'd all been renamed identically by FindMergeHaplotypesPlugin), and that led to array-out-of-bounds errors later on
- This sets the snpID in the haplotype file to be the same as in the source file. Otherwise it's renamed to S[chr]_[pos], which causes issues later on if there are multiple SNPs at the same location. Also clarifies the use of -hapSize
- improvements to code that prevent more than one file logger from being appended to the rootlogger
- reorganize and implement checksubpop so that it can be called as part of different imputation methods
- Load key variables into memory for GBS V3 discovery SNP caller
- add check subpops to clusteronly method
- GBSV3 discovery SNP caller, support HDF5
- delete markers with 0 maf from subpops in check subpops, fix bug in clusterOnly method
- Hypothesis prediction and best mapping selection
- made changes to callParentAllelesByClusterOnly so that it now calls haplotypes through sections of extreme segregation distortion as long as parameters are set properly
(V4.3.2) October 3, 2013
- Cached TBT in genetic mapping, save 35% time
- finished implementing StepwiseOLSModelFitterDialog
- implementing stepwise model fitting
- added method to check subpopulations
- implemented code for filtering sites by subpopulation
- modified site filter in checksubpops to deal with large heterozygous segments effectively
- fixed bugs so that model fitter can generate a marker effect report. Currently works for markers as nested covariates but not markers as nested factors
- bug fixes for genetic mapping hypothesis
- GWAS results: All chromosomes now go to a single HDF5 file.
- incorporated changes from Alex to implement BIC and mBIC model selection
- started new method for identifying haplotypes by clustering.
- modified reading of numeric markers to avoid excessive memory use
- corrected error causing clusterDistanceClusterDiffProportion() to return the wrong value
- redefine unanimous haplotype to allow one offtype at a site
- add code to find haplotypes by clustering every window across the genome.
- Added new enzyme combo HindIII-NlaIII to GBS Pipeline
(V4.3.1) September 5, 2013
- Continued Implementation of new Alignment Design
- bug fixes in AnnotateTOPM
- Fixed bug with FilterAlignment regarding Loci and Loci offsets
- Add new clustering package for clustering genotypes to identify common haplotypes
- Added new enzymes combo BamHI-MluCI and EcoRI to GBS Pipeline
- fix bug in setParameters that did not use the last option if it takes no parameter in CallParentAllelesPlugin
- optimize genetic mapping
- adds code to calculate blups, predicted values, and residuals
- Replace first em-dash with en-dash in Tassel Pipeline Arguments to accept em-dash and en-dash.
(V4.3.0) (GBS Pipeline Pass Initial Testing) (Requires Java 1.7) August 22, 2013
- Started Implementation of new Alignment Design
- Add BWA BLAST to TOPM
- added function to extract selection from an HDF5 file
- Continued implementation of Production Pipeline
- Added new enzymes HindIII, EcoRI-MspI, HindIII-MspI, and SexAI-Sau3AI
- Fixed [bugs:#170]. When using the command line, specifying -ldType SiteByAll -ldTestSite [#] threw an error if the specified site number was < 1. Site numbering starts at 0, though, so the check was changed to reject only site numbers < 0
- Add PETagsOnPhysicalMapV3, PE tags from both ends are truncated to 64 bp (with positions of full-length PE) list, support multiple hits.
- Add PE and genetic mapping to HDF5 TOPM.
- Changed Nucleotide constants to return '0' when diploid value has '+'. For example A+, C+, G+, T+
- annotate TOPM from a folder in which there are slices of BLAST result
- CallParentAllelesPlugin allow processing of multiple alignments
(V4.2.1) August 1, 2013
- Corrected the reporting of time that the process (writing TagsByTaxa file) takes
- Improved usage statement for ModifyTBTHDF5Plugin
- add jni library for BLAS/LAPACK
- Production Pipeline improvements
- Renamed TagsToSNPByAlignmentPlugin to DiscoverySNPCallerPlugin
- Deleted RawReadsToHapMapPlugin.java and RawReadsToHapMapQuantPlugin.java
- renamed SeqToGenosPlugin to ProductionSNPCallerPlugin
- MutableHDF5Alignment using sites annotation caching, and introduction of site annotation object
- ProductionSNPCallerPlugin now writes reads per sample reports before trying to write HDF5 genotypes
- Add TagsOnPhysicalMapV3
- Updated TOPMSummaryPlugin to output variant refs correctly.
- Added an option in ExportUtils.writeToMutableHDF5(Alignment a, String newHDF5File, IdGroup exportTaxa, BOOLEAN KEEPDEPTH) to retain depth information in exported alignment. Set to false in all other methods that call this.
- Fixes bug with my maxVariants when want to expand variant number
- Fixed a bug: topm.getUniquePositions(chromosomes[i]) --> topm.getUniquePositions(i)
- add ejml-0.23.jar and remove ejml-0.13.jar
- Introduction of the Site package in pal.
(V4.2.0) (New Graphical Interface Look) July 18, 2013
- Fixed bug in FindMergeHaplotypesPlugin when creating haplotypes with small chromosomes
- Fixed bug when writing chunks to MutableNucleotideAlignmentHDF5
- Fixed bug where load file chooser not appearing in Java 7
- Improved caching in MutableNucleotideAlignmentHDF5
- Added -importGuess and -table options to Tassel Pipeline
- Initial GBS Production Pipeline added
- Better logging message when "Deleting Dataset"
- Improved javadoc for several classes
- Simpler implementation of LD Analysis that supports large sparse test designs
- Deleted -s parameter from FastqToTBTPlugin since it is not relevant for that plugin
- Increased default value of -s parameter in FastqToTagCountPlugin and QseqToTagCountPlugin from 200,000,000 to 300,000,000
- Fixed bug in FastqToTagCountPlugin
- classes implement a DoubleMatrix that uses BLAS and LAPACK for compute intensive operations and java code for simpler operations
- Corrected QQ and Manhattan plots to handle converting P Values from either Double or String to double. Issue #TAS-11 - QQ and Manhattan Plots Fail when reloading TableReport from File
(V4.1.34) June 27, 2013
- Fixed #165: Filter Alignment had opposite behavior for site names to remove.
- Changed Treatment types to this: Sets the LD Heterozygous Treatment Method. Type can be Haplotype (Default - For Inbred Lines), Homozygous (Uses only homozygous sites - heterozygotes set to missing), or Genotype (Not Implemented Yet).
- Remove unused imputation approaches
- Added additional javadoc to GBS code.
- Fixed bug with getHomozygousNucleotideInstance()
- Renamed reference/alternate coding in GBS
- Read file names without multi-chromosome qualifiers
- Improved speed calculating LD for Inbreds
- Fixed importing of Phylip files
- Adding ByteHDF5 export as an option for GUI
- Fixed Numerical Genotype -> Separate Alleles Function
- Fixed key bug on not imputing heterozygous undercalls. Uncovered a RACE situation with HDF5
- Changed default preference whether to retain rare alleles to false (previously true).
- Improved handling of Tassel Preferences: Preference changes should be persisted when executing GUI and set only temporarily from Command Line Flags. Also getting preferences should use stored values when executing GUI. And should use default values (if not temporarily set) when executing from Command Line.
- Improved PrintHeapAction to report "Max available Heap" and to report all numbers in MB.
- Changed Data, Analysis, and Results Modes to have drop down menus at the top instead of buttons.
- Fix bug involving MinorWindowViterbiImputationPlugin when using chromosome mode
(V4.1.33) June 13, 2013
- New MutableBitNucleotideAlignmentHDF5
- New export to Mutable HDF5 Alignment
- Added variable transition option to ViterbiAlgorithmPlugin
- MinorWindowViterbiImputation now counts the number of section donors and uses all of them
- New MutableNucleotideAlignmentWithDepth. Always holds depth for all 6 nucleotide alleles
- New readLine() utility that skips hash (#) lines that are comments.
- MinorWindowViterbi produces a projection alignment for use with high density sequences. Also fixed a few bugs with heterozygous parents.
- Renamed HDF5 dataset from BASES to Genotypes. Clearer to outside biologist.
- New GBS Production Pipeline (SeqToGenosPlugin) uses MutableNucleotideDepthAlignment and stores depth
- Quantitative SNP calling (likelihood ratio method) added, Added counters for nBarcodedReads per fullSampleName and finalSampleName (libPrepID), Added reporting of goodBarcodedReads per full and final sample name
- added getEffectSize function to model effects that returns number of columns in the design matrix
- added option for vcf likelihood-based calling of heterozygotes
- readVCF method is modified so that "./." genotypes will be read properly. Depths are set to 0 for ./. genotypes.
- creates and implements SnpDataB73Ref class that compares all snps to B73 to determine genotype
- Improving memory efficiency of TOPMs, including ragged arrays for variants
- initialize bestSnp in findNextTerm with default SnpInfo
- MutableHDF5 Merge that should scale to entire genomes
- SeqToGenos now writes to a single MutableNucleotideDepthAlignment for all chromosomes
- Code to calculate detailed accuracy reports from 55k masked and randomly masked sites.
- Whole chromosome support for FindMergeHaplotypes
- Increased support for multiple chromosome imputation
- Mutable Alignment fixes bug with taxa with same name
- Whole genome imputation part 2
- Completed imputation with multiple chromosomes per alignment
- Fixed [bugs:#164] Bug in calculating het frequency in genome summary
- Fixes issue with some TOPM_HDF not being initialized correctly, and changes the deflation to faster setting
- Production pipeline, Now logs number of reads matched to TOPM (in addition to good, barcoded reads)
- Improved implementation and documentation of NucleotideAlignmentConstants.getNucleotideAlleleByte() utility method.
- Added functionality for masking hets, and added ability to use it through both the CLI (option -ldHetTreatment, with values "ignore", "missing", and "thirdstate") and the GUI. "Ignore" is the old way and is set as the default, including in methods I put in for backwards-compatibility. Choosing "third state" calls a function that just throws an error for now; idea is for Peter to just plug his algorithm in there with minimal need to alter the rest of the pipeline. Talking with Ed, I used a MutableNucleotideAlignment to switch all the hets to missing in calculateBitLDForInbred; he says we should redo this by taking the input SBit alignment and just using AND operations on the genotypes to generate a mask, then apply it to switch all those genotypes to missing. I have no idea how to do this, so I'm passing it off to/requesting help from those more knowledgeable in it.
- Fixed readEndCutSiteRemnantLength (=4) for PstI-ApeKI
- Added BitAlignment.getHomozygousNucleotideInstance()
- add PETagsOnPhysicalMap
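Note on the readLine() utility entry above: a comment-skipping read can be sketched as a loop around BufferedReader.readLine() that discards lines beginning with a hash. This is an illustrative Java sketch under assumptions, not the TASSEL utility itself: the class and method names are hypothetical, and only lines whose first character is '#' are treated as comments.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CommentSkippingReader {

    /** Returns the next line that does not start with '#', or null at end of input. */
    static String readLineSkipComments(BufferedReader reader) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            if (!line.startsWith("#")) {
                return line;
            }
        }
        return null;
    }

    /** Convenience wrapper for in-memory text: collects every non-comment line. */
    static List<String> dataLines(String text) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = readLineSkipComments(reader)) != null) {
                result.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dataLines("# header\nchr1\t100\n# note\nchr1\t200"));
    }
}
```

Callers that already loop on readLine() until null can switch to the skipping variant without other changes, since the end-of-input contract is the same.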
(V4.1.32) May 23, 2013
- Initial Implementation of new SNP Caller (SeqToGenosPlugin)
- Added utility read / write TOPM files (.topm, .topm.txt, .topm.bin, .topm.h5)
- Implementing and using TOPMInterface for compatibility with Tassel 3 TOPM and new HDF5 TOPM
- NucleotideImputationUtils. imputeUsingViterbiFiveState() emission matrix changing the values in two cases from 0.98 to 0.998, (each row should sum to 1).
- Modified WritePopulationAlignmentPlugin to handle cases where there are no monomorphic sites.
- Several modifications were made to the writeToVCF method in ExportUtils: 1) For alignment objects with no depth information, a proper VCF file is created without depth; 2) If the ref allele is not set, the major allele is set as the reference allele, and this is indicated in the header line; 3) If there is an unknown allele, the genotype is set as unknown
- MinorWindowViterbiImputation and FindMergeHaplotypesPlugin added
- Pair End Pipeline Added
- On Taxa Properties filter, changed label to "Min Proportion of Sites Present" and made this calculate missing only if both alleles are missing (not based on gametes).
(V4.1.31) May 16, 2013
- Added Min. Heterozygous limit to Filter Alignment by Taxa Properties Plugin
- Modified Table Report Display to fit row header width to data length
- Changed VCF import to accept files without read depth per allele (-AD tag). The depth will not be set.
- Regarding MergeDuplicateSNPsPlugin, REFERENCE_GENOME no longer needs to be present in the pedigree file. If a pedigree file is used that does not contain REFERENCE_GENOME, then REFERENCE_GENOME is not used when comparing duplicate SNPs.
(V4.1.30) May 9, 2013
- Added previousSetBit() to BitSet Interface
- Reading VCF files improved so that sites with more alleles than maximum can be read and top alleles will be kept
- Improvements to VCF support in Alignments
- A bug is corrected in writeVCF function in ExportUtils. When all reads are from a single alternative allele, the ExportUtil function would crash. This could happen when some taxa are filtered.
- Added methods to scope ordering of allele values. New methods are getAllelesByScope(), getAllelesScopeType(), getAllelePresenceForAllTaxaByScope(). And the scope types are Frequency, Depth, and Global.
- Added Algorithm to merge nearly identical sequences into long haplotypes
- CallParentAllelesPlugin add log messages, correct bug to set multipleBC correctly
- ViterbiAlgorithmPlugin implements the option to use recombination rates that vary depending on position in the chromosome
- Added TransitionProbabilityWithVariableRecombination class that implements variable recombination rate
- Fixed a bug in the consensusCallsForVCF method of the MergeIdenticalTaxaPlugin class.
- The maxMismatch and keepDuplicateAllele parameters are now supported when running MergeDuplicateSNPsPlugin on VCF files.
- Better handling of -exportType by showing useful error message if type invalid.
- Added method to merge two Identifiers (taxa). Only highest most levels that are the same will be kept
- Added method to merge two loci. Start and end positions are min and max of original two.
- Modified ID Join Strict preference to have Strict (old true), NonStrict (old false), and NumLevels. Also added new preference to define number of levels to use with NumLevels mode.
- Corrected multiple bugs with the Merge Alignments Function
- Fixed bugs regarding ?s and rare alleles when loading polymorphism data.
- Added new method to Alignment interface to support multiple sites with same physical position and different SNP IDs. public int getSiteOfPhysicalPosition(int physicalPosition, Locus locus, String snpID);
- Added FilterTaxaPropertiesPlugin to filter Taxa from Alignment based on specified criteria.
(V4.1.29) April 25, 2013
- New version of JHDF5 12.02.3 (2013-01-03)
- Fixed bug with VCF Genotype Score Lookup Table
- For VCF Export, when reference allele is unknown, the major and minor alleles are recorded
- Added restriction enzyme SbfI with Elshire common adapter to GBS Pipeline
- Corrected bug with Heterozygous Distance Calculation
- In TOPMSummaryPlugin, suppressed the out of range warning when the variant is Gap
- In TagsToSNPByAlignmentPlugin, Gap alleles are now properly added to the TOPM
- MergeDuplicateSNPsPlugin and MergeIdenticalTaxaPlugin now support VCF file input.
- Exporting VCF files now correctly records depths over the maximum (127) as the maximum
- In TOPMSummaryPlugin, removed warnings for positions out of range as those aren't necessarily errors.
- Corrected bug setting reference alleles when importing VCF files
- Corrected bug importing and displaying Polymorphism Formatted Files
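Note on the VCF depth entry above: 127 is the largest value a signed Java byte can hold, and a plain narrowing cast of a larger int wraps around to a negative number. The following is a hedged Java sketch of saturating the depth instead. It assumes depths are stored one per byte; the class and method names are hypothetical and are not taken from the TASSEL source.

```java
class DepthClampSketch {
    static final int MAX_DEPTH = Byte.MAX_VALUE; // 127

    /** Saturates a read depth at MAX_DEPTH so it fits in a signed byte. */
    static byte toStoredDepth(int depth) {
        if (depth < 0) {
            throw new IllegalArgumentException("depth must be non-negative: " + depth);
        }
        return (byte) Math.min(depth, MAX_DEPTH);
    }

    public static void main(String[] args) {
        int observed = 200;
        System.out.println((byte) observed);          // plain cast wraps around
        System.out.println(toStoredDepth(observed));  // clamped value
    }
}
```

Clamping loses the exact count above the maximum but keeps the ordering of depths intact, which a wrapped negative value would not.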
(V4.1.28) April 18, 2013
- Fix Bug when getting physical position from unknown locus returned -1. Now throws an exception.
- Added caching of variant definitions from TOPM HDF5.
- Added ability to update variant definitions in TOPM HDF5
- Optimized Site Name Filter to handle very large numbers of names (i.e. greater than 4 million)
- Improvements to SNP Caller
- Exporting Hapmap files can now include more than 2 values in alleles column
(V4.1.27) April 11, 2013
- Exporting VCF files now outputs genotypes in reference/alternate order instead of major/minor and only 3 alleles when all 3 are present
- For consistency with other GBS plugins, MergeDuplicateSNPsPlugin now uses -sC and -eC (start and end chromosomes)
- Fixed bug that occasionally occurred when exporting HDF5 Alignments. The error message indicated that the alignment was not optimized for sites or taxa
- Faster implementation for exporting HDF5 Alignments
(V4.1.26) April 4, 2013
- Added new restriction enzymes NdeI and HinP1I (Elshire common adapters)
- SNP Caller: if the reference option is invoked and the target chromosome is not found in the reference genome fasta file, no SNPs are called for that chromosome and a warning is written to the output indicating that the chromosome is being skipped.
- SNP Caller: if the -ref option is invoked, SNPs where the reference allele is a gap (-) are now filtered from the output.
- SNP Caller: for the -vcf option, the output file is named after the HapMap output file, with the .hmp.txt extension replaced with .vcf. If the output hapmap file name ends in .hmp.txt.gz, then the output vcf file is also compressed and will end with .vcf.gz.
- SNP Caller: for consistency with other plugins, the starting and ending chromosomes are now specified using -sC and -eC (instead of -s and -e).
- SNP Caller: the -o option must now specify the output HapMap file name (not just the directory), using + as a wild card character in place of the chromosome number. If the filename ends with *.hmp.txt.gz, the output files (one per chr) will be compressed. Existence of the output directory is now tested BEFORE calling SNPs.
- Added option to Pipeline -separate flag to allow optional specification of chromosome(s) to separate. Nothing specified returns all chromosomes as before.
- Fixed occasional string index out of range error when adding reference tag in SNP Caller
- Added Kmer Plugins
- Fixed bug when importing VCF Files: allele depths were incorrectly being populated, unable to contain values above 15.
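Note on the SNP Caller output naming entries above: the rules (a + wildcard standing in for the chromosome number, and .hmp.txt or .hmp.txt.gz mapped to .vcf or .vcf.gz) can be sketched in Java as below. This is an illustration only. The class and method names are hypothetical, and two behaviors are assumptions not stated in the entries: every + in the pattern is replaced, and a name with neither suffix is rejected.

```java
class SnpCallerOutputNames {

    /** Replaces the '+' wildcard with the chromosome number. */
    static String hapMapNameForChromosome(String pattern, int chromosome) {
        if (!pattern.contains("+")) {
            throw new IllegalArgumentException("output name needs a '+' wildcard: " + pattern);
        }
        return pattern.replace("+", Integer.toString(chromosome));
    }

    /** Derives the VCF name from the HapMap name, keeping compression if present. */
    static String vcfNameFor(String hapMapName) {
        String gzSuffix = ".hmp.txt.gz";
        String plainSuffix = ".hmp.txt";
        if (hapMapName.endsWith(gzSuffix)) {
            return hapMapName.substring(0, hapMapName.length() - gzSuffix.length()) + ".vcf.gz";
        }
        if (hapMapName.endsWith(plainSuffix)) {
            return hapMapName.substring(0, hapMapName.length() - plainSuffix.length()) + ".vcf";
        }
        // Assumption: anything else is treated as an error in this sketch.
        throw new IllegalArgumentException("not a HapMap file name: " + hapMapName);
    }

    public static void main(String[] args) {
        String hapMap = hapMapNameForChromosome("out/myGBS.chr+.hmp.txt.gz", 10);
        System.out.println(hapMap);
        System.out.println(vcfNameFor(hapMap));
    }
}
```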
(V4.1.25) March 21, 2013
- Fixed Bug initializing myReference in MutableAlignment Constructor
- Integrated VCF Import Function into GUI and CLI (pipeline)
- Added Support for Importing / Exporting gzip VCF Files
- Added Support for Importing VCF Files
(V4.1.24) March 8, 2013
- Bug fix for MLM
- SNP Caller (TagsToSNPByAlignmentPlugin) now includes taxon "REFERENCE_GENOME" in HapMap and VCF files.
- Added support for Reference in Alignments
- Formatting correction to VCF Export
- Corrected problem with LD Display of Full Matrix and Sliding Window
(V4.1.23) February 28, 2013
- Code refactoring for improved code maintenance.
- Corrected minor problems with LD. 1) Sorts "Site List" if provided; 2) R, D', and P values >=0 are returned unchanged and less than zero are NaN; 3) Changed Sample Size and R to return correct value regardless if indices r < c or c < r.
- Added robustness checks to LD. 1) R, D Prime, and P values less than zero throw IllegalStateException; 2) IllegalStateException thrown if Sample Size is less than the minimum taxa for estimate when R, D Prime, or P values aren't NaN.
- Faster implementation of BitUtils.toPadString() method
(V4.1.22) February 14, 2013
- Exporting Table Reports now supports .gzip
- Corrected LD results, so that getRSqr(x,y) == getRSqr(y,x)
- Added code to remove progress bar after Chart Display finishes
- Added imputeLinkageMarkerAcrossFamilies
- Fixed bug related to reading agpv2 version of the map
- Corrected problem with Alignment getBits() to return long
- Change sequential method to use full model to calculate best snp F-test and p-value
- When removing unwanted SNP variants, changed result TOPM to keep all Tags (even if no variants defined).
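Note on the LD symmetry entry above (getRSqr(x,y) == getRSqr(y,x)): one common way to guarantee this property is to store each unordered pair once and normalize the index order on every access. The Java sketch below shows that technique with a triangular array. It is not the TASSEL LD code; the class, its layout, and the use of NaN for unset entries are all assumptions for illustration.

```java
import java.util.Arrays;

class SymmetricPairTable {
    private final float[] values; // lower triangle including the diagonal

    SymmetricPairTable(int numSites) {
        values = new float[numSites * (numSites + 1) / 2];
        Arrays.fill(values, Float.NaN); // NaN marks "not computed" (assumption)
    }

    /** Maps (r, c) and (c, r) to the same slot by ordering the indices first. */
    private static int index(int r, int c) {
        int hi = Math.max(r, c);
        int lo = Math.min(r, c);
        return hi * (hi + 1) / 2 + lo;
    }

    void set(int r, int c, float value) {
        values[index(r, c)] = value;
    }

    float get(int r, int c) {
        return values[index(r, c)];
    }

    public static void main(String[] args) {
        SymmetricPairTable rSqr = new SymmetricPairTable(4);
        rSqr.set(3, 1, 0.5f);
        System.out.println(rSqr.get(1, 3) == rSqr.get(3, 1));
    }
}
```

Because both orderings resolve to one slot, the two lookups cannot disagree, and the table needs roughly half the memory of a full square matrix.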
(V4.1.21) January 30, 2013
- Added logging for Total Number of SNPs
- added maxsnps parameter. calculate p value for stopping from full model for test with residuals.
- Expansion of minor window approach to rapid inbred imputation
- Corrected problem with checking if variant defined.
- fixes output file parameter bug
- add option to choose next entry based on test using residuals from the previous model
- fix getPredicted bug, make getNumberOfLevels work better
- Method to get a subset of the longs backing a bitset using arraycopy
- changes to move bootstrap gwas to plugin
- Corrected problem when keeping variants on negative strand tags.
- Additional logging to show positions from tags with max variants defined. And corrected logging of SNPs per chromosome.
- Additional logging for writeBinaryWVariantsFile()
(V4.1.20) January 24, 2013
- Added more logging.
- Outputs HapMap files with all the heterozygous regions set to Unknown
- Addresses a situation with heterozygous parents in the simulator
- New TOPM Summary Plugin
- Made variantsDefined() public so that it can be used by the TOPMSummaryPlugin
- New plugin to keep only tags based on list(s) of sites in chromosomes.
- Added writeBinaryWVariantsFile(File outFile) to write only tags with variants to TOPM
- add modified classes from Tassel 3 for gwas
- New Plugin to Merge Multiple TOPMs assuming no overlaying changes from original
- Added getVariantOff() and getVariantDef()
- Changed this line to set base to UNKNOWN instead of X: a.setBase(taxon, s + startBase, (byte) 'X'); is now a.setBase(taxon, s + startBase, Alignment.UNKNOWN_DIPLOID_ALLELE);
- Added additional encoding translations to NUCLEOTIDE_IUPAC_HASH. Also changed undefined Nucleotide values to UNDEFINED_ALLELE_STR = "X"
- Added extra logging info. on error when writing Hapmap files.
(V4.1.19) January 10, 2013
- Updated some mappings for NUCLEOTIDE_IUPAC_HASH
- Fixes two bugs in TASSEL 3 to 4 migration for FastImputationBitFixedWindow
- added code to handle multiple backcross families
- added function to impute a set of markers at specific genetic positions
- new class for NAM marker map that provides a means to translate between genetic and physical positions
- Increased basic number of tests to something reasonable (N=20)
- deleted, code no longer used
- Pipeline to estimate heterozygosity by regions. Just a preliminary stub
- Pipeline for designing hybrids with interesting centromeric regions
(V4.1.18) January 2, 2013
- Added look ahead caching. Seeing about 8% speed improvement. Tried many ideas to get this minimal improvement!
- Improved performance of major and minor allele calculations and using caching of values.
- Improved calculation of genetic distances to avoid constant tbit optimizing and caching values.
- Improved caching. Using LinkedHashMap instead of WeakHashMap.
- Implemented "Remove minor allele" for Site filter.
- Corrected message text in exception thrown.
- Changed Alignment Viewer Reference function to not highlight Allele values that are Unknown even if different from reference.
- Changed default Alignment Viewer color coding the Major/Minor Alleles
(V4.1.17) December 20, 2012
- Commented unused code.
- Committed code that's not ready and may not be used.
- Several improvements to the Alignment Viewer. Uses simplified Vertical column headers that works under Java 7. Also reference, genetic distance, nucleotide, major allele, minor allele, etc. color coding.
- Added VCF export options to pipeline and GUI
- Added line to append .vcf extension when exporting VCF files.
- Added VCF out to TagsToSNPByAlignmentPlugin.java
- modified tag filter by adding a new method
- Added MutableVCFAlignment.java
- Extracted whole genome GBS data for Zak's fine mapping RC-NILs along with a subset of the teo/W22 BC2S3 population
- Initial update. Still working on this.
- Changed getNucleotideIUPAC() to return ? when state is undefined to prevent errors when displaying values. This happens when rare alleles are converted to Z resulting for example in diploid value A:Z which has no IUPAC code.
(V4.1.16) December 13, 2012
- New Alignment Masks for reference and genetic distance
- New UI that creates vertical labels.
- Updated to work on MAC with Java 7
- Minor refactoring to clean up this class.
- Corrected behavior when selecting Alignment Masks that do not have a set color.
- Removed logging from optimizeForTaxa() and optimizeForSites() to reduce unnecessary log messages.
- Corrected toString() to not use Color if not set.
- Added etched border around Taxa Row headers. Mainly for look when running under Java 7
- remove bug introduced while debugging
- Added support to load HDF5 Alignment Files.
- HDF5 Alignment Implementation
(V4.1.15) December 6, 2012
- Added support to export Alignments to HDF5 format from either the GUI or command line.
- Changed Allele States in HDF5 format to handle a double dimension String array, so that it generically works with Nucleotide data and Text Data (has different mappings per site).
- Corrected problem with allele counts. Should be type long (not int)
- Corrected log message
- Added two getBufferedWriter() methods that allow for appending to file and input of File or String filename.
- Added new factory method public static FilterAlignment getInstance(Alignment a, Locus locus) to separate one locus.
- Added Alignment.getStartAndEndOfLocus()
- Improved writeToHDF5 format
- Removed unused line
- Added isNucleotideEncodings() method
- Added convenience constructor when only name is known. It will be used for name and chromosome.
- Added more constants
- Corrected calculation of end site
- Added the restriction enzyme Sau3AI
- Updated HDF5 format and used constants instead of hard coded strings.
- Added more constants
- Revert changes.
- New HDF5Constants Class
- Better handling of HDF5 filename
- Using constants instead of hard coded numbers.
- Renamed SBitAlignmentNucleotideHDF5.java -> BitAlignmentHDF5.java
- Moved SBitAlignmentNucleotideHDF5.createFile() to ExportUtils.writeToHDF5()
(V4.1.14) November 29, 2012
- Adding support for phased Alignments
- Added serialVersionUID
- MutableSingleEncodeAlignment.java update for VCF output
- skip adjacent 10 sites when calculating avg ld to allow for mismapped blocks
- add ability to provide logfile name as a parameter
- Continued migration to Tassel 4 and added SNP Logging
- Continued migration to Tassel 4
- debug computeGenotypeR
- Changed SNPLogging interface and corrected SNP logging for GBSHapMapFiltersPlugin
- Added getPhasedAllelePresence() methods for phased alignment support
- Better definition of ONE
(V4.1.13) November 15, 2012
- Renamed SBitPhasedAlignment to BitPhasedAlignment
- Continued Migration to Tassel 4
- Add setDepthForAllele()
- in callParentAllelesByWindow, set minMaf = 0 if minMaf < 0 and parent contribution is invalid
- begin implementation of phased imputation
- add function for phased Viterbi (empty for now)
- Added SNP removal logging
- Added usage for -snpLog
- Continued Migration to Tassel 4
- add options for calling parent alleles
- Heterozygosity profiler for association panels
- Beginning of annotating HDF5 TOPM
- TOPM HDF5 model
- More realistic synthetic simulation
- added segregation ratio and ld filters for snps
- replace forward slash in family name, change default for parameters
- add code for ld check and segregation ratio check
- Improved logging of SNPs removed.
- Updated to send log messages to console if no file defined.
- Start of logging removed SNPs
- Added SNPLogging class to log SNPs removed by GBS Plugins
- Added method getBufferedWriter() that allows appending to end of file.
- Now reports how many taxa read from the taxa list input file and how many matches found in the hapmap file
- taxaMerge and polyFilter for Zak's whole genome GBS genotypes
(V4.1.12) November 8, 2012
- Corrected overAllHomoErrorRate calculation.
- Using MutableAlignment.clearSiteForRemoval() instead of MutableAlignment.removeSite(). This prevents problems with offset site numbers. clearSiteForRemoval() stages site for removal when clean() is executed.
- Added clearSiteForRemoval(int site)
- Add logging for Output Filename\tOver All Homo Error Rate\tCoverage
- Continued migration to Tassel 4
- Corrected Constructor
- Corrected removeSite method
(V4.1.11) November 1, 2012
- tagLoci with the status noVarSitesInAlign were not being reported in the tagLocusLog. This is now fixed.
- LocusLog reporting now performed for all tagLoci
- Added Taxa Reporting
- Changed format of summary logging.
- set default of -1 for inbredCoef
- changed getTwoCluster to filter out low coverage taxa before clustering
- add het masker
- Added Mean and Median Calculations.
- The mnLCov is now a true minimum (>= comparison rather than >) and the refTag field in the tagLocusLog is populated even for rejected tags
- Now reports (to console) total number of taxa and sites in the two hapmap files -Jeff
- Added checks that (1) each input hapmap file name contains a "+" wildcard character (in place of the chr#), (2) each contains only one chromosome, (3) each contains the same chromosome, and (4) this is the expected chromosome matching the file name. -Jeff
- Corrected output of allele strings and removed use of string concatenation (+) when writing to file
- New JeffPipelines class for Tassel4 GBS
- Added a locusLogFile output which, so far, records info on all of the rejected TagLoci
- isSiteGood now checks if the site has been rendered invariant after setting genos with rare variants to missing
- Added isEqual methods for comparing diploid values that match regardless of order.
- Added methods to get Nucleotide Complement
- modified hetMasker. added estimatedHetFraction as a parameter.
- added het masker function
(V4.1.10) October 24, 2012
- Implemented getMajorAllele() and getMinorAllele()
- Fixed bug when "Capture Unselected" is used.
- Implemented equals() method
- Corrected Allele Presence methods to translate taxa index and site index.
- Implemented getTotalNumAlleles()
- Added convenience utility methods to get distance matrix
- fixed bug in whichSnpsAreFromSameTag that could cause an error at the last site
- Provide better support for inbreds and solving regionally.
- Methods for tracking all the best donors
- Dealing with change in mutable alignment
(V4.1.9) October 18, 2012
- New forester.jar from Peter that's the older code that works with GBS but is modified to not exit Tassel when closed and to not break menus.
- Added setTaxonName to MutableAlignment Interface. And changed implementation in MutableSingleEncodeAlignment to only set. Use addTaxon to add.
- deleted, function moved to WriteAlignmentPlugin
- add code to write nucleotides instead of parent calls, add monomorphic sites
- improve calling conversion to nucleotides, minor bug fixes
- Methods for testing range of imputation parameters
- Refactoring to make code clearer
- Added expansion from focus block
- Semi-working version of imputations with known parents
- Some minor changes hoping to improve speed & removing some old, commented out code left over from tassel3 version
- When determining if fastq or qseq, now checks for "qseq" only in the file name, not in the full path
- Beginnings of class for known parent imputation
- Add setTaxonName to MutableAlignment
- Revert "only output selected snps for each family"
- only output selected snps for each family
- jarsigned forester.jar
(V4.1.8) October 11, 2012
- Add code to center Archaeopteryx dialog
- Added support for SynonymizerPlugin (-synonymizer)
- Changed Synonymizer to support running non-interactively.
- Reverting to previous forester.jar to fix problems with biojava dependency. May update later.
- likelihoodRatioThreshAlleleCnt is now calculated based upon the expected sequencing error rate per base (errRate) provided by the user (default = 1%) - Added by Jeff
- Removed unused imports.
- Newer version of junit jar
- Added ArchaeopteryxPlugin to the Tassel GUI
- Corrected colapse to be collapse
(V4.1.7) October 4, 2012
- Supporting jars for Archaeopteryx plugin
- Added support for exporting "Report" via the GUI as either a "Report" or "Text"
- Optimization to equals() method.
- Corrected problem setting Locus. Now using equals() to compare Loci instead of ==. Also corrected cleaning up unused Loci.
- Changes to make SNP calling work in Tassel4. Still some kinks to be worked out.
- add archaeopteryx plugin
- added archaeopteryx plugin initially commented out
- add new functionality
- added and debugged file consolidation utilities
- Added options to use the new DistanceMatrixRangesPlugin Plugin
- New Plugin to calculate distance matrix for given taxa at specified physical position ranges.
- Added utility to calculate distance given two taxa: public static double computeHetBitDistances(Alignment theTBA, int taxon1, int taxon2, int minSitesCompared, boolean isTrueIBS)
- Added check for when start physical position and end physical position result in no sites.
- Minor refactoring.
- Added convenience method to create filtered alignment based on physical positions.
- Object to hold TOPM mapping information
- HDF5 version of a TOPM file
- Added method to pad short sequences with AAAAAAAA
- Made printRows public from protected
- add Archaeopteryx tree viewer plugin
(V4.1.6) September 27, 2012
- Corrected calculation of proportion of Heterozygous sites for taxa.
- Added public int getTotalNotMissingForTaxon(int taxon);
- Added removeUnusedLoci() to clean up unused Loci. Contributed by Jason W.
- Added Enzymes RBSTA and RBSCG.
- add log file, improved logging
- Continued refactoring with Jeff.
- Changed getPositionInLocus() and getSNPID() to use site number if not set.
- Continued refactoring of getSNPCallsQuant() with Jeff
- Added Allele constants for A, C, G, T, INSERT
- Refactoring getSNPCallsQuant() working w/ Jeff
- Removing TagsToSNPByAlignmentMTPlugin
- Improved error messages to user and fixed filtering by chromosome feature.
- Added utility method that separates an Alignment into FilterAlignments for each Loci.
- fixes allele count filter
- debugging
- Corrected getSiteOfPhysicalPosition() implementation
- corrected numerical genotype code for heterozygotes, fixes problem with svd not converging
(V4.1.5) September 19, 2012
- Added -filterAlignLocus to Pipeline to specify which Locus associated with start / end physical positions defined by FilterAlignmentPlugin
- Added getLocus(String) method to return Locus of matching String.
- Changed this to show message to use other software for imputation until we decide which if any algorithms to make available.
- Added options -convertToSiteOpt -convertToTaxaOpt -distanceMatrix -filterAlignStartPos -filterAlignEndPos
- Adding ProcessLoadBitAllelesTaxon to optimize for Taxa to prevent errant results due to race conditions. Before it was processing alignment one site at a time.
- Added ability to set start / end physical position from command line.
- New Plugin to Create Distance Matrices.
- added code to merge nonconsensus files
- Fixed bug caused by using myMaxNumAlleles instead of get method.
- Added logging statements to show Tassel version and Max Memory reported by JVM when running Tassel GUI.
- implement use of pedigree file
- Added check for different-length sequences in fasta file for better error message.
- added parameter code
- add QualityChecksPlugin class
(V4.1.4) September 14, 2012
- Added MseI enzyme
- calculate diagonal as true IBS
- modified code to determine if snps are likely from the same tag
- add code to ignore lines with family=NA
- added function to optionally calculate diagonal as IBS rather than setting it to 0
- Corrected initialCutSiteRemnant for ApoI and likelyReadEnd for BamHI.
- Implement Allele Presence Method
- Implemented Allele Presence Methods.
- Fixed constructor
- Added public long cardinality(int index);
(V4.1.3) September 6, 2012
- Fixed bug when importing Hapmap file whose number of taxa is an even multiple of 64.
- Housekeeping for ant build script
- Added Enzyme BamHI
- Added enzyme ApoI
- Added new File type "Text" to use for exporting Report class as simply text.
(V4.1.2) August 30, 2012
- Minor optimizations to transpose
- Multi-threaded transpose(BitSet[]) method and added progress reporting to it.
- Added options -tree -treeSaveDistance for creating distance trees.
- Added support to export class implementing the Report interface
- Formatting. Removed "Save taxa groups" checkbox that isn't used.
- Reimplemented transpose() function to use bit operations.
- Added getBits(int) method to return single long at index.
- LD display dialog code clean up, removed geno/chromo view options, added lock Y to X scroll bar option
- implemented improved clustering for determining haplotypes
- added setters for parameters, fixed error in setParameters
- Added getDepthForAllele() method.
- Removed unused method.
- Added optimizeForSites() and optimizeForTaxa()
- Fixed imports to deal with fusion of SBit
- HDF5 version of SBit alignment
- Preliminary stub for SAM reader and mapper
- Creating TOPM interface. Probably still need an abstract TOPM.
- Slight modifications to expanding 64-bit imputation, still not ready for real use.
- Corrected casting problem.
- Replaced SBitAlignment and TBitAlignment with BitAlignment. Also added methods Alignment.optimizeForTaxa() and Alignment.optimizeForSites().
- LD dialog update, removed permutation number from constructors
(V4.1.1) August 23, 2012
- Fixed public byte[] getAlleles(int site). Result needs to be of size max num alleles.
- Fixed calculation of number missing gametes and proportion of heterozygotes.
- LD dialog redesign, removed rapid impute from LD constructors
- Removed unused comment
- Moved method bits2words() from OpenBitSet to BitUtil
- Added transpose method
- Added method to count number of lines in file.
- Changes to error messages only.
- Changes to comments only.
- Fixed synchronization problem when loading Hapmap as TBitAlignment. Plus other performance tuning.
- Improved export of Hapmap to only print single major allele if minor allele is unknown.
- Added public void setLong(int wordNum, long bits);
(V4.1.0) (Begin GBS Migration) August 17, 2012
- Added -optimizeForTaxa to pipeline which creates TBitAlignment instead of default SBitAlignment when loading a Hapmap file.
- Added new method for importing Hapmap files that builds Bit Sets as it's parsing the file.
- Added constructors to allow creation with pre-built Bit Sets and Allele freq. lists. This is used by import Hapmap.
- remove archaeopteryx import
- removed archaeopteryx plugin
- new self-describing plugin that extracts a subset of lines from a hapmap file without creating an alignment
- added getBufferedWriter functions that will optionally write gzip files
- Added support to export hapmap files gzipped (i.e. .hmp.txt.gz).
- Changed getType() method name, so that it will compile in Java 7.
- LD code cleanup, synced with TASSEL3. LD plot update, window resize now waits to redraw.
- Added LD minimap, tooltips in plots enabled.
- LD scroll bars functional, added code for future minimap
(V4.0.28) August 9, 2012
- Added support for importing .hmp.txt.gz files.
- Increased timeout to 10 minutes just to make sure.
- LD plot now supports all 4 LD options. Scroll bars still not implemented.
- Increased timeout value to 5 minutes for processing sites just in case very large file. Also added printStackTrace for more information.
- Added getAllelesSortedByFrequency() for single site and getDiploidValues()
- Alignment backed by HDF5 file structure
- Added constants for A, C, and M
- LD Component, work in progress
- Changed TableReport method sigs back to type int
- Updated LD
- Formatting, removed unused imports, removed main(), removed Class DummyReport
- Add commons math library jar
- Add UNKNOWN constant.
- Add method to combine two allele values into one byte.
- Add constants for GAP.
- Corrected perl scripts
- Add logging to show actual max heap size.
- Revert "LD row and column from index methods"
- New Jar
- Potential methods for LD redesign
(V4.0.27) August 2, 2012
- Modified perl scripts to use min/max heap size defaults individually.
- Add logging to Tassel Pipeline to show actual max heap size.
- Updated perl scripts to set $top variable correctly for execution from any location. Also added ability to specify min and max heap size from command line.
- Uses SNPIDs if no Locus for labels.
(V4.0.26) July 26, 2012
- LD row and column from index methods
- Adding gbs/gwas packages to build.
- changes to which parameters are set by user
- changed method to convert nucleotides to numbers for clustering
- changes to command line options
- Added check for isInteractive() when reporting error message.
(V4.0.25) July 12, 2012
- Added -versionComment and -versionTag options.
- Add support for importing tab delimited files as TableReports.
- comment out debug print in Viterbi function
- check for NumberFormatException
- corrects a few bugs
(V4.0.24) June 28, 2012
- Added check if one file and one data, then don't add number to filename.
- Changed saveDataToFile() method to use correct TableReportUtils.saveDelimitedTableReport instead of old toDelimitedString(). Also removed saveFiles variable which isn't used.
- Added comment
- Removed Save button. Other preferred ways to save/export now.
- Add file extensions for Table Report export.
- Added setSaveFile(String) convenience method.
- Manhattan plot matches colors in event of missing chromosomes
(V4.0.23) (Includes Prior Versions Also) June 26, 2012
- added code to identify parent alleles in every 100 snp window across genome
- Added options for Diversity Analysis.
- Manhattan plot orders chromosomes.
- Removed convert alignment plugin. not ready for that.
- Adjustments to Control Buttons to wrap based on window size.
- Better logging message.
- Optimization of getAlleles() method and fixed problem with AbstractAlignment.getMajorMinorCounts() after site filtering.
- Minor correction and optimization to getAlleleEncodings() method.
- modifications to imputation pipeline that provide improved configurable logging and changes to log output
- Changed Tassel Pipeline -export option (ExportMultiplePlugin) to accept no parameter resulting in the filenames being the name of the DataSets being exported. Also added logging to show filenames that DataSets are being exported to in all cases.
- Changed ExportUtils methods to return filename that was written to.
- Added flag -separate to invoke SeparatePlugin.
- Changed SeparatePlugin to return everything (DataSet) as a unit instead of one at a time. Better for -export at command line since -export knows to number multiple datums when writing. If one fired at a time, file keeps getting overwritten.
- Corrected SeparatePlugin for Alignments in Tassel 4. Before in Tassel 3 Alignment was only 1 chromosome except CombineAlignment. Now all Alignments support multiple chromosomes. The function now uses FilterAlignment to separate individual Alignments for each chromosome.
- Added javadoc to getInstance() method.
- Added MergeAlignmentsSameSitesPlugin class which merges Hapmap files quickly assuming all sites match exactly.
- Added findNthOccurrenceInString() method.
- new or updated classes that implement Plugins for imputation
- corrects adjusted probability calculation in setNode. Needed to divide by 2 when calculating m.
- made a variety of changes and corrections. Still a work in progress
- moved Population class in NucleotideImputor to independent PopulationData class to work with plugins better
- Better handling of single allele value to homozygote value in AlignmentUtils.setDataBytes() method
- modified setDataBytes to handle data that contains ":"
- implemented reading polymorphism format for TASSEL 4
- Added areEncodingsEqual() utility method to compare alignment encodings.
- Added options to -retainRareAlleles, -maxAllelesToRetain, -mergeAlignments, -exportType HapmapDiploid
- New Plugin that merges multiple alignments. Alignments can have differing site names and taxa. Any combination of site/taxa not defined in one of the alignments is set to UNKNOWN.
- Added File Type HapmapDiploid for exporting Hapmap files with diploid values (i.e. not IUPAC codes) from the command line.
- Added myTree.setLargeModel(true) to hopefully prevent data tree nodes from being displayed like this "mdp_g"
- Added Locus.equals() method.
- update with improved scoreParents code that works with nasty rice backcross population
- add code that generates a sorted index. The index indicates the sorted order of an array, but the array itself is not reordered.
- latest changes to imputation code
- Test algorithms for imputation
- Corrected initialization of end site (myEnd) when running from command line.
- QQ plot dialog update. reworked how QQ plot simples plotting. -Jon
- Refactored MutableNucleotideAlignment moving most implementation to MutableSingleEncodeAlignment to remove requirement to be Nucleotide data except only in MutableNucleotideAlignment
- Corrected exception message.
- debug scoreParents function
- -excludeSiteNames Filters input alignment to exclude site names specified in file. The site names cannot have spaces. Individual site names should be separated by white space.
- made changes to parent imputation
- Print Tassel Version and Build Date at beginning of pipeline execution.
- bringing changes in NucleotideImputor up to date. Still a work in progress.
- Added private static final long serialVersionUID = -5197800047652332969L; for better compatibility between builds.
- Added ability to read/write serialized alignment objects.
- Removing gwas files. These were accidentally included into Tassel 4 before ready.
- Imputation methods now use precomputed Chi square distributions
- Added -genotypeSummary
- Added check if data is null.
- changed readFromHapmap method call since the old one disappeared
- add function to cut the tree at a specific height
- adding imputation code to TASSEL v4
- Added code to show error message when can't write allele values (i.e. A:Z) to hapmap file in single letter codes.
- Added ability to import files with Z (rare) code. And removed possibility for '?' being exported to file.
- Use Max alleles and retain rare alleles prefs when loading Fasta files.
- Using new Alignment preferences when loading hapmap files.
- Added preferences for loading alignments: can set max number of alleles retained and whether to retain rare alleles.
- Imputation should use expanding 64bp windows
- Removed unused methods.
- Removed unused AlignmentPlugin prefs.
- Delete unused class
- Added -excludeTaxaInFile
- Fixing bugs in ProjectionAlignment
- Fixing a bug that let null in ProjectionAlignment
- Fixing bugs in ProjectionAlignment
- Added Taxa Names to Taxa Summary.
- Completing methods for MutableNucleotideAlignment
- More complete pipeline for projection
- Methods for imputing on HapMap files
- Method for calculating genetic distance from bit set
- SNP comparison methods between alignments
- Added isHeterozygous() utility method.
- Corrected accumulative report for r values equal to 1.0.
- Add better Double string formatting when exporting TableReports to file.
- Alignment comparisons
- Implemented isIndel()
- Fixed imputation and binary search
- Corrections to setting positions
- Better reporting of errors resulting from processing SNPs in different threads.
- Requiring this to be Nucleotide Data.
- Improvement to isSBitFriendly() and isTBitFriendly()
- Work on ProjectionAlignment
- Remove getDiploidIdentity(). Unneeded in Tassel 4.0 as it supports Hets.
- Removed -ck options that handled Hets differently. 4.0 is designed to handle Hets all the time.
- remove code that handles hets differently
- remove code to treat hets differently
- Methods for imputation and projection of low density maps to high density maps. Not functional yet.
- Translate -+ or +- to 0
- Corrected clean() method to reset isDirty to false.
- Added ConvertAlignmentCoordinatesPlugin.
- Corrected initialization of loci and physical positions.
- Corrected handling of SNP ids.
- Corrected error checking condition
- New Design of MutableNucleotideAlignment
- Added constructor for MutableAlignment to use.
- setLocationRelativeTo(getParentFrame())
- -ckModelHets Sets how to model heterozygotes. Choose default type RelateHomo (Related to Homozygotes) or IndepState (Independent allele state). -ckRescale true | false Set whether to rescale results between 2 and 0. Default is true.
- Changed recognize hets to default true.
- Added -glmPermutations option
- Added options to Genotype Summary to select which summaries to calculate.
- Fixed problem with returning original alignment if filtering not needed.
- Changed getSiteOfPhysicalPosition() to use first locus if input locus is null. FilterAlignmentPlugin assumes only one locus in Alignment unless CombineAlignment. Probably need to update FilterAlignmentPlugin.
- Improved factory method when convert from one alignment to another.
- Added Option to convert between sbit and tbit alignments.
- Remove unneeded code
- Removed logging statement
- Added Taxa summary report
- Added icons to data tree to show whether Alignment Data Set is optimized for taxa/site operations.
- Added isSBitFriendly() and isTBitFriendly()
- Some clean up
- Added methods public int getTotalGametesNotMissingForTaxon(int taxon); public int getHeterozygousCountForTaxon(int taxon);
- Removed unused code.
- Changed to getMajorMinorCounts() which counts major/minor allele combinations per site.
- Added getMajorMinorDiploidssSortedByFrequency(int site) and getMajorMinorDiploidCounts()
- Changed some counter from int to long
- Corrected problem with missing proportions.
- Added getDiploidssSortedByFrequency() and modified getDiploidCounts() to return Object[][] and sort.
- Added more Alignment interface methods to call Base alignment implementations instead of AbstractAlignment. Specific Alignment implementations may be faster.
- Added and corrected methods that operate on a whole site. If the filter hasn't removed any taxa, the base alignment implementation can be called with the translated site number. If taxa have been removed, the AbstractAlignment implementation is used to ensure correct results.
- Added fast implementation of getHeterozygousCount()
- Couple optimization changes.
- Added fast implementation for isPolymorphic()
- Better error checking for getMajorAlleleCount and implementation for getMajorAlleleFrequency()
- Improved implementation of getMajorAlleleCount()
- Slight optimization of getIncludedSitesBasedOnFreqIgnoreMissing()
- Improved error checking for getMinorAlleleCount()
- Improved getMinorAlleleCount() implementation
- Added bit wise implementation of getAllelesSortedByFrequency()
- Run super.getDiploidCounts() if multiple allele state mappings.
- Changed algorithm for counting diploids in Nucleotide data sets
- Added methods getDiploidAsString() and getDiploidCounts()
- Better handling of creating an Alignment from another Alignment
- Add icon
- icon for GenotypeSummaryPlugin
- Moved GenotypeSummaryPlugin to Analysis Control Panel
- Added implementation of getTotalGametesNotMissing()
- Corrected problem with initializing global variable for multiple executions. Added "site name" and "physical position". Corrected spelling of Proportion
- Add Genotype Summary button to Data control panel.
- Added -includeSiteNames and -includeSiteNamesInFile options.
- Added ability to filter from list of site names.
- New getInstance() method that includes list of site names.
- Plink export bug fix.
- Additional stats reported.
- Added UNKNOWN_DIPLOID_ALLELE constant
- Right justify common types.
- Formatting, better error checking, etc.
- Added faster isHeterozygous() implementation over AbstractAlignment implementation.
- Added Proportion Heterozygous reporting
- Added method getHeterozygousCount()
- Add method getMajorAlleleCount()
- Changed method name getTotalCountNotMissing() to getTotalGametesNotMissing(). Also in site filter, doubling user given min. count when comparing against total gametes not missing.
- Implementing more Alignment interface methods.
- Corrected problems with getAlleleEncodings()
- Implemented several Alignment interface methods.
- Migrating Jon's scrolling view of LD Plot
- Added option whether to retain taxa if unknown when creating subset using FilterAlignment. The FilterTaxaAlignmentPlugin will not retain unknown taxa. The Union join will retain unknown taxa.
- Added -numericalGenoTransform option for numerical genotype transforms.
- Add Max Freq field to Site Alignment Dialog
- Added function to the LD GUI that allows SiteByAll and SiteList (from FilterAlignment) to be used.
- Added method getBaseSitesShown()
- Added getMajorAlleleAsString() and getMinorAlleleAsString() convenience methods.
- Search accepts numbers in E notation
- Added ability to accumulate LD R2 results into distribution bins.
- Added command line ability to filter alignment based on max freq.
- Removed use of Dimension object to set image size in AbstractDisplayPlugin. This is to prevent Exception thrown when using these display plugins from the command line on a remote machine.
- Corrected reporting of progress. Integer calculation was reporting 0 every time
- Minor bug fixes for plotting options.
- Fixed Progress Reporting
- Removed println statement
- Implemented CombineAlignment.getLoci() and getLociOffsets()
- Rewrote getRow() method to return appropriate data types instead of all Strings.
- Added public String getFullTaxaName(int index)
- Fix loci offsets by initializing myLociOffsets in constructor
- Added includeantruntime="false" to prevent warning message.
- Added support for AlignmentUtils.removeSitesOutsideRange
- Corrected spelling of presence in getAllelePresenceForSitesBlock() and getAllelePresenceForAllSites()
- Add fast bit methods to IBSDistanceMatrix, works for hets now. Actually faster than before.
- Corrected problem with returned loci offsets
- Removed allele freq. methods to allow abstract class methods to execute. That produces correct result.
- Added 'X' as possible character from Nucleotide files that gets translated to 'N'
- Implemented getLociOffsets()
- Implemented additional methods.
- Implemented getAlleleEncodings() method.
- rewrote getIncludedSitesBasedOnFreqIgnoreMissing() to use getTotalCountNotMissing() and getMinorAlleleFrequency() for better performance
- I've added these methods to the Alignment interface... And implemented them in the abstract class.
- Add bit methods to LinkageDisequilibrium. Only for inbreds.
- Add bit methods to IBSDistanceMatrix. Only for inbreds.
- Fixing functionality to filter by Min. Freq and Min Count.
- Removed getBaseAsStringArray() method, so that Abstract class is used. This needs to return separated allele values. Only makes sense to combine for getBaseAsString()
- Ed's test with Git commit with IBSDistanceMatrix
- correct data classes used by FixedEffects plugin