> Actually i'm storing the bits from a substructure bit file into an
> arraylist and in another arraylist i'm storing the common bits of all sdf
> molecules ,, and then comparing to find out uncommon bits . i'm trying it
> using remove and retain methods in array list.
Do not do this; use a BitSet (java.util.BitSet). It was designed to handle
exactly these situations. The xor operation gives you the difference (the
uncommon bits) and the and operation gives you the common bits.
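As a rough sketch of the above, using the two fingerprints from the start of this thread: the class and helper names (BitSetDiff, of, common, uncommon) are made up for illustration, and in practice the BitSets would come from your fingerprinter and substructure file rather than being typed in by hand.

```java
import java.util.BitSet;

class BitSetDiff {

    // Hypothetical helper: build a BitSet from a list of set-bit indices.
    static BitSet of(int... bits) {
        BitSet s = new BitSet();
        for (int b : bits)
            s.set(b);
        return s;
    }

    // Bits set in both a and b (and). The inputs are left unmodified.
    static BitSet common(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.and(b);
        return r;
    }

    // Bits set in exactly one of a and b (xor). The inputs are left unmodified.
    static BitSet uncommon(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.xor(b);
        return r;
    }

    public static void main(String[] args) {
        BitSet query = of(0, 1, 17, 87, 95, 142, 168, 273, 274, 294, 299, 300, 301, 306);
        BitSet sdf   = of(17, 87, 95, 142, 273, 274, 294, 301, 306);
        System.out.println("common:   " + common(query, sdf));
        System.out.println("uncommon: " + uncommon(query, sdf)); // {0, 1, 168, 299, 300}
    }
}
```

Note that and/xor modify the BitSet they are called on, which is why the sketch clones first.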
In Java 7 these two methods give you the raw bits, which you can then write
out as hex to store:
BitSet.toByteArray()
BitSet.toLongArray()
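One possible way to do the hex step is sketched below. The class and method names (BitSetHex, toHex, fromHex) are invented for this example; only toByteArray() and BitSet.valueOf(byte[]) are standard Java 7 API. Keep in mind that toByteArray() is little-endian (bit 0 is the lowest bit of the first byte) and drops trailing zero bytes, so the hex string does not record the fingerprint length.

```java
import java.util.BitSet;

class BitSetHex {

    // Write each byte of the BitSet as two lower-case hex digits.
    static String toHex(BitSet s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.toByteArray())
            sb.append(String.format("%02x", b & 0xff));
        return sb.toString();
    }

    // Reverse of toHex: parse pairs of hex digits back into a BitSet.
    static BitSet fromHex(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++)
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        return BitSet.valueOf(bytes);
    }

    public static void main(String[] args) {
        BitSet s = new BitSet();
        s.set(0);
        s.set(9);
        String hex = toHex(s);
        System.out.println(hex);          // 0102
        System.out.println(fromHex(hex)); // {0, 9}
    }
}
```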
You can get a binary string as shown below:
BitSet s = …
String padding = "0000000…000"; // 64 0's
StringBuilder sb = new StringBuilder();
for (long x : s.toLongArray()) {
    // note: each 64-bit word is written most significant bit first
    String bits = Long.toBinaryString(x);
    String padded = padding.substring(0, 64 - bits.length()) + bits;
    sb.append(padded);
}
or
BitSet s = …
StringBuilder sb = new StringBuilder();
// len = fingerprint length
for (int i = 0, len = 1024; i < len; i++)
sb.append(s.get(i) ? '1' : '0');
String bits = sb.toString();
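If it helps, the loop above can be wrapped in a small method so the fingerprint length is passed in rather than hard-coded. This is only a sketch; the names BitSetBinary and toBinaryString are made up here, and len should be whatever your fingerprinter reports as its size.

```java
import java.util.BitSet;

class BitSetBinary {

    // One character per bit, bit 0 first, padded with '0' up to len.
    static String toBinaryString(BitSet s, int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++)
            sb.append(s.get(i) ? '1' : '0');
        return sb.toString();
    }

    public static void main(String[] args) {
        BitSet s = new BitSet();
        s.set(0);
        s.set(2);
        System.out.println(toBinaryString(s, 4)); // 1010
    }
}
```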
See also
http://en.wikipedia.org/wiki/Bit_array
On 11 Jun 2013, at 06:35, Gauri S <[email protected]> wrote:
>
> Sir John ,
> Actually i'm storing the bits from a substructure bit file into an
> arraylist and in another arraylist i'm storing the common bits of all sdf
> molecules ,, and then comparing to find out uncommon bits . i'm trying it
> using remove and retain methods in arraylist.
>
> Sir, can you please tell how to get bits as 1 0 1 0 format instead of the
> bits as 0,1,2,3,4,,, and so on
>
> Thanks
> Gauri
>
>
>
> John May-4 wrote:
>>
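>> Yep, you need to take the complement of the bits. If you're working on the
>> raw binary it's the unary '~' operator; on a BitSet it's the flip method,
>> which takes the range of bits to flip.
>>
>> BitSet set = fingerprint.asBitSet(); // 1.5.x
>> set.flip(0, fprinter.getSize()); // flips bits [0, size); BitSet has no no-arg flip()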
>>
>> J
>>
>> On 6 Jun 2013, at 10:49, Gauri S <[email protected]> wrote:
>>
>>>
>>> Sir , can we find those bits which are missing,
>>> eg suppose my molecule is giving me fingerprint like this
>>> {0, 1, 2, 9, 10, 11, 12, 14, 18, 19, 20, 33, 143, 145, 146, 178, 182,
>>> 184,
>>> 185, 189, 192, 283, 284, 285, 286, 293, 299, 308, 332, 333, 337, 341,
>>> 344,
>>> 346, 349, 351, 352, 353, 355, 356, 366, 368, 370, 371, 374, 381, 382,
>>> 384,
>>> 390, 392, 393, 405, 406, 412, 416, 420, 430, 434, 439, 441, 443, 446,
>>> 451,
>>> 470, 476, 489, 490, 498, 507, 516, 520, 524, 535, 541, 542, 548, 552,
>>> 556,
>>> 564, 565, 567, 570, 573, 574, 578, 579, 582, 584, 586, 589, 590, 592,
>>> 594,
>>> 595, 599, 603, 604, 606, 608, 613, 614, 617, 618, 619, 620, 626, 632,
>>> 634,
>>> 637, 640, 641, 643, 645, 650, 651, 655, 660, 662, 664, 666, 667, 668,
>>> 677,
>>> 678, 679, 680, 681, 683, 684, 688, 689, 692, 696, 697, 698, 699, 704,
>>> 708,
>>> 709, 710, 712, 713, 714, 719, 734, 735, 755, 756, 776, 777, 782, 797,
>>> 798,
>>> 818, 819}
>>>
>>> can i find the missing bits eg 3,4,5,6,7,8 ,,,,,, and so on ???
>>>
>>>
>>>
>>> John May-4 wrote:
>>>>
>>>> No problem,
>>>>
>>>> Did you see the resolution? You need to add the implicit hydrogens,
>>>> otherwise the matching isn't done correctly.
>>>>
>>>> Thanks,
>>>> J
>>>>
>>>> On 27 May 2013, at 08:31, Gauri S <[email protected]> wrote:
>>>>
>>>>>
>>>>> Sir John ,
>>>>> I'm using Substructure fingerprinter
>>>>>
>>>>>
>>>>>
>>>>> John May-4 wrote:
>>>>>>
>>>>>> Hi Gauri,
>>>>>>
>>>>>> Which fingerprinter are you using?
>>>>>>
>>>>>> J
>>>>>>
>>>>>> On 22 May 2013, at 13:45, Gauri S <[email protected]> wrote:
>>>>>>
>>>>>>>
>>>>>>> I have used SmilesParser to parse through the smile and generate the
>>>>>>> fingerprint , it prints
>>>>>>> fingerprints of query:{0, 1, 17, 87, 95, 142, 168, 273, 274, 294,
>>>>>>> 299,
>>>>>>> 300,
>>>>>>> 301, 306}
>>>>>>>
>>>>>>> when i used sdf file of same molecule and generated fingerprint , it
>>>>>>> prints
>>>>>>> bitsetarray: [{17, 87, 95, 142, 273, 274, 294, 301, 306}]
>>>>>>>
>>>>>>> even if it same molecule , still it does not consider 0,1,168,299,300
>>>>>>> bits
>>>>>>>
>>>>>>> So, can anyone please tell me why is this difference and which method
>>>>>>> is
>>>>>>> suitable to get the results properly?
>>>>>>>
>>>>>>> this is my small part of the code
>>>>>>>
>>>>>>> ArrayList<IMolecule> molList = new ArrayList<IMolecule>();
>>>>>>> ArrayList<BitSet> bitsetarray = new ArrayList<BitSet>();
>>>>>>> ArrayList<BitSet> bitsetarray1 = new ArrayList<BitSet>();
>>>>>>> ArrayList<String> molidarray = new ArrayList<String>();
>>>>>>> ArrayList<String> molidarray1 = new ArrayList<String>();
>>>>>>> //ArrayList<String> molidarray2 = new ArrayList<String>();
>>>>>>> IMolecule molecule = null;
>>>>>>> String query = "CC1=C(C)C2=C(CCC(C)(COC3=CC=C(CC4SC(=O)NC4=O)C=C3)O2)C(C)=C1O";
>>>>>>> SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
>>>>>>>
>>>>>>> IAtomContainer mol1 = sp.parseSmiles(query);
>>>>>>> mol1 = new AtomContainer(mol1);
>>>>>>> BitSet fingerprint1 = fprinter.getFingerprint(mol1);
>>>>>>> System.out.println("fingerprints of query:" + fingerprint1);
>>>>>>>
>>>>>>> File sdfFile = new File("D:/gauri/cdk/Vasodilator/DB00197.sdf");
>>>>>>>
>>>>>>> IteratingMDLReader reader = new IteratingMDLReader(
>>>>>>>         new FileInputStream(sdfFile),
>>>>>>>         DefaultChemObjectBuilder.getInstance());
>>>>>>>
>>>>>>> System.out.println("Reading the file...");
>>>>>>> while (reader.hasNext()) {
>>>>>>>     molecule = (IMolecule) reader.next();
>>>>>>>     molList.add(molecule);
>>>>>>>     fingerprint = fprinter.getFingerprint(molecule);
>>>>>>>     // fprinter.getSize();    // returns 881
>>>>>>>     // fingerprint.length();  // returns the index of the highest set bit + 1
>>>>>>>     bitsetarray.add(fingerprint);
>>>>>>>     molidarray.add(molecule.getProperty("DRUGBANK_ID").toString());
>>>>>>>     // molidarray2.add(molecule.getProperty("SMILES").toString());
>>>>>>> }
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://old.nabble.com/fingerprints-generated-differently-for-same-molecules-using-different-methods---smileparser-and-reading-the-sdf-file-using-IteratingMDLReader-tp35424370p35424370.html
>>>>>>> Sent from the cdk-user mailing list archive at Nabble.com.
>>>>>>>
>>>>>>>
>>>>>>> ------------------------------------------------------------------------------
>>>>>>> Try New Relic Now & We'll Send You this Cool Shirt
>>>>>>> New Relic is the only SaaS-based application performance monitoring
>>>>>>> service
>>>>>>> that delivers powerful full stack analytics. Optimize and monitor
>>>>>>> your
>>>>>>> browser, app, & servers with just a few lines of code. Try New Relic
>>>>>>> and get this awesome Nerd Life shirt!
>>>>>>> http://p.sf.net/sfu/newrelic_d2d_may
>>>>>>> _______________________________________________
>>>>>>> Cdk-user mailing list
>>>>>>> [email protected]
>>>>>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>>
>
>
>