Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-31 Thread Lauren McIver
Hi Tin,

Perfect! The accepted pull requests are what I was referring to in my prior
email. Sorry I was unclear! The changes were all merged in main before I
removed my forked repository.

Thank you,
Lauren


On Wed, Aug 31, 2016 at 1:33 PM, Duy Tin Truong 
wrote:

> Thanks Lauren. Actually, I only merged the requests that you sent and the
> files you uploaded in the Download page are still in the main repository.
>
> Cheers,
> Tin
>
> On Wed, Aug 31, 2016 at 6:43 PM Lauren McIver 
> wrote:
>
>> Hi Tin,
>>
>> I deleted the fork since as you mentioned the changes were merged into
>> the main repository.
>>
>> Thanks!
>> Lauren
>>
>>
>> On Wed, Aug 31, 2016 at 1:31 AM, Duy Tin Truong 
>> wrote:
>>
>>> Hi Andreas,
>>>
>>> That is the fork from Lauren and we merged the change into the main
>>> repository. The main repository is still at:
>>> https://bitbucket.org/biobakery/metaphlan2/overview
>>>
>>> Thanks,
>>> Tin
>>>
>>> On Wed, Aug 31, 2016 at 9:11 AM Andreas Tille  wrote:
>>>
>>>> Hi again,
>>>>
>>>> I'm pretty sure I have seen the 2.6.0 tag right after your mail.  I was
>>>> a bit busy since then and wanted to have a look now.  I noticed that
>>>> metaphlan2 moved to another user on bitbucket and
>>>>
>>>> https://bitbucket.org/ljmciver/metaphlan2/downloads?tab=tags
>>>>
>>>> only shows 2.5.0 as latest tag.
>>>>
>>>> Am I missing something?
>>>>
>>>> Kind regards
>>>>
>>>>      Andreas.
>>>>
>>>> On Fri, Aug 19, 2016 at 12:18:31PM +0200, Andreas Tille wrote:
>>>> > Hi Duy,
>>>> >
>>>> > thanks for the information.  It might take some time since I'll be a bit
>>>> > offline-ish next week but now I know what the correct target for the
>>>> > package will be.
>>>> >
>>>> > Kind regards
>>>> >
>>>> >      Andreas.
>>>> >
>>>> > On Fri, Aug 19, 2016 at 07:17:25AM +, Duy Tin Truong wrote:
>>>> > > Hi Andreas,
>>>> > >
>>>> > > Thanks for your explanation. We have officially updated metaphlan2 to
>>>> > > version 2.6.0 as shown in tags.
>>>> > > So, when it is convenient for you, please help to add the package of this
>>>> > > new version.
>>>> > >
>>>> > > Many thanks,
>>>> > > Tin
>>>> > >
>>>> > > On Fri, Aug 12, 2016 at 6:31 PM Andreas Tille wrote:
>>>> > >
>>>> > > > Hi Tin,
>>>> > > >
>>>> > > > On Fri, Aug 12, 2016 at 03:30:25PM +, Duy Tin Truong wrote:
>>>> > > > > > However the hint to hclust2[1] is helpful.  Unfortunately I can not find
>>>> > > > > > any description for this software.  Since you might have some influence on
>>>> > > > > > this it would be great to provide a hint where I can find a description for
>>>> > > > > > a potential package.
>>>> > > > > >
>>>> > > > > hclust2 is used to plot heat-maps and not directly used in metaphlan2.py or
>>>> > > > > strainphlan.py. In other words, metaphlan2 does not depend tightly on this
>>>> > > > > tool. However, I will update the wiki page later.
>>>> > > >
>>>> > > > Sounds like a neat tool anyway - so if there is a description I could
>>>> > > > provide a package.
>>>> > > >
>>>> > > > > > > and here for strainphlan (a sibling tool that uses the same database as
>>>> > > > > > > metaphlan2 and both are in the same repository and should go together,
>>>> > > > > > > strainphlan is in strainphlan.py and metaphlan2 is in metaphlan2.py):
>>>> > > > > > > https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1
>>>> > > > > >
>>>> > > > > > Well, the download file for metaphlan2 version 2.5 has strainer_src and
>>>> > > > > > metaphlan2_strainer.py - is this what you mean?
>>>> > > > > >
>>>> > > > > Yes, strainer_src is now strainphlan_src and metaphlan2_strainer.py is now
>>>> > > > > strainphlan.py. As I mentioned before, it is better to use the latest
>>>> > > > > version of the repository now because the tutorial now fits with the new
>>>> > > > > names:
>>>> > > > > https://bitbucket.org/biobakery/metaphlan2#markdown-header-metagenomic-strain-level-population-genomics
>>>> > > > >
>>>> > > > > and we may not change them again :).
>>>> > > >
>>>> > > > So *if* you want to let users use the latest state of the repository why
>>>> > > > don't you do a new versioned release to make it official?  Debian uses a
>>>> > > > system to check web pages for versioned releases.  We can not sneak into
>>>> > > > each repository nor guess wildly whether it is a stable commit or not.  Is
>>>> > > > there any reason not to release say version 2.6 or 2.5.1 or whatever?
>>>> > > >
>>>> > > > Kind regards
>>>> > > >
>>>> > > >       Andreas.
>>>> > > >
>>>> > > > > > [1] https://bitbucket.org/nsegata/hclust2
>>>> > > >
>>>> > > > --
>>>> > > > http://fam-tille.de
>>>> > > >
>>>> >
>>>> >

Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-31 Thread Duy Tin Truong
Thanks Lauren. Actually, I only merged the requests that you sent and the
files you uploaded in the Download page are still in the main repository.

Cheers,
Tin

On Wed, Aug 31, 2016 at 6:43 PM Lauren McIver 
wrote:

> Hi Tin,
>
> I deleted the fork since as you mentioned the changes were merged into the
> main repository.
>
> Thanks!
> Lauren
>
>
> On Wed, Aug 31, 2016 at 1:31 AM, Duy Tin Truong 
> wrote:
>
>> Hi Andreas,
>>
>> That is the fork from Lauren and we merged the change into the main
>> repository. The main repository is still at:
>> https://bitbucket.org/biobakery/metaphlan2/overview
>>
>> Thanks,
>> Tin
>>
>> On Wed, Aug 31, 2016 at 9:11 AM Andreas Tille  wrote:
>>
>>> Hi again,
>>>
>>> I'm pretty sure I have seen the 2.6.0 tag right after your mail.  I was
>>> a bit busy since then and wanted to have a look now.  I noticed that
>>> metaphlan2 moved to another user on bitbucket and
>>>
>>> https://bitbucket.org/ljmciver/metaphlan2/downloads?tab=tags
>>>
>>> only shows 2.5.0 as latest tag.
>>>
>>> Am I missing something?
>>>
>>> Kind regards
>>>
>>>  Andreas.
>>>
>>> On Fri, Aug 19, 2016 at 12:18:31PM +0200, Andreas Tille wrote:
>>> > Hi Duy,
>>> >
>>> > thanks for the information.  It might take some time since I'll be a
>>> bit
>>> > offline-ish next week but now I know what the correct target for the
>>> > package will be.
>>> >
>>> > Kind regards
>>> >
>>> >Andreas.
>>> >
>>> > On Fri, Aug 19, 2016 at 07:17:25AM +, Duy Tin Truong wrote:
>>> > > Hi Andreas,
>>> > >
>>> > > Thanks for your explanation. We have officially updated metaphlan2 to
>>> > > version 2.6.0 as shown in tags.
>>> > > So, when it is convenient for you, please help to add the package of
>>> this
>>> > > new version.
>>> > >
>>> > > Many thanks,
>>> > > Tin
>>> > >
>>> > > On Fri, Aug 12, 2016 at 6:31 PM Andreas Tille 
>>> wrote:
>>> > >
>>> > > > Hi Tin,
>>> > > >
>>> > > > On Fri, Aug 12, 2016 at 03:30:25PM +, Duy Tin Truong wrote:
>>> > > > > > However the hint to hclust2[1] is helpful.  Unfortunately I
>>> can not
>>> > > > find
>>> > > > > > any description for this software.  Since you might have some
>>> > > > influence on
>>> > > > > > this it would be great to provide a hint where I can find a
>>> > > > description for
>>> > > > > > a potential package.
>>> > > > > >
>>> > > > > hclust2 is used to plot heat-maps and not directly used in
>>> metaphlan2.py
>>> > > > or
>>> > > > > strainphlan.py. In other words, metaphlan2 does not depend
>>> tightly on
>>> > > > this
>>> > > > > tool. However, I will update the wiki page later.
>>> > > >
>>> > > > Sounds like a neat tool anyway - so if there is a description I
>>> could
>>> > > > provide a package.
>>> > > >
>>> > > > > > > and here for strainphlan (another brother tool uses the same
>>> database
>>> > > > > > with
>>> > > > > > > metaphlan2 and both are in the same repository and should go
>>> > > > together,
>>> > > > > > > strainphlan is in strainphlan.py and metaphlan2 is in
>>> metaphlan2.py):
>>> > > > > > >
>>> > > > > >
>>> > > >
>>> https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1
>>> > > > > >
>>> > > > > > Well, the download file for metaphlan2 version 2.5 has
>>> strainer_src and
>>> > > > > > metaphlan2_strainer.py - is this what you mean?
>>> > > > > >
>>> > > > > Yes, strainer_src is now strainphlan_src and
>>> metaphlan2_strainer.py is
>>> > > > now
>>> > > > > strainphlan.py. As I mentioned before, it is better to use the
>>> latest
>>> > > > > version of the repository now because the tutorial now fits with
>>> the new
>>> > > > > names:
>>> > > > >
>>> > > >
>>> https://bitbucket.org/biobakery/metaphlan2#markdown-header-metagenomic-strain-level-population-genomics
>>> > > > >
>>> > > > > and we may not change them again :).
>>> > > >
>>> > > > So *if* you want to let users use the latest state of the
>>> repository why
>>> > > > don't you do a new versioned release to make it official.  Debian
>>> uses a
>>> > > > system to check web pages for versioned releases.  We can not
>>> sneak into
>>> > > > each repository nor wild guessing if it is a stable commit or
>>> not.  Is
>>> > > > there any reason not to release say version 2.6 or 2.5.1 or
>>> whatever?
>>> > > >
>>> > > > Kind regards
>>> > > >
>>> > > >   Andreas.
>>> > > >
>>> > > > > > [1] https://bitbucket.org/nsegata/hclust2
>>> > > >
>>> > > > --
>>> > > > http://fam-tille.de
>>> > > >
>>> >
>>> > --
>>> > http://fam-tille.de
>>> >
>>> >
>>>
>>> --
>>> http://fam-tille.de
>>>
>>
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-31 Thread Duy Tin Truong
Hi Andreas,

That is the fork from Lauren and we merged the change into the main
repository. The main repository is still at:
https://bitbucket.org/biobakery/metaphlan2/overview

Thanks,
Tin

On Wed, Aug 31, 2016 at 9:11 AM Andreas Tille  wrote:

> Hi again,
>
> I'm pretty sure I have seen the 2.6.0 tag right after your mail.  I was
> a bit busy since then and wanted to have a look now.  I noticed that
> metaphlan2 moved to another user on bitbucket and
>
> https://bitbucket.org/ljmciver/metaphlan2/downloads?tab=tags
>
> only shows 2.5.0 as latest tag.
>
> Am I missing something?
>
> Kind regards
>
>  Andreas.
>
> On Fri, Aug 19, 2016 at 12:18:31PM +0200, Andreas Tille wrote:
> > Hi Duy,
> >
> > thanks for the information.  It might take some time since I'll be a bit
> > offline-ish next week but now I know what the correct target for the
> > package will be.
> >
> > Kind regards
> >
> >Andreas.
> >
> > On Fri, Aug 19, 2016 at 07:17:25AM +, Duy Tin Truong wrote:
> > > Hi Andreas,
> > >
> > > Thanks for your explanation. We have officially updated metaphlan2 to
> > > version 2.6.0 as shown in tags.
> > > So, when it is convenient for you, please help to add the package of
> this
> > > new version.
> > >
> > > Many thanks,
> > > Tin
> > >
> > > On Fri, Aug 12, 2016 at 6:31 PM Andreas Tille 
> wrote:
> > >
> > > > Hi Tin,
> > > >
> > > > On Fri, Aug 12, 2016 at 03:30:25PM +, Duy Tin Truong wrote:
> > > > > > > However the hint to hclust2[1] is helpful.  Unfortunately I can
> not
> > > > find
> > > > > > any description for this software.  Since you might have some
> > > > influence on
> > > > > > this it would be great to provide a hint where I can find a
> > > > description for
> > > > > > a potential package.
> > > > > >
> > > > > hclust2 is used to plot heat-maps and not directly used in
> metaphlan2.py
> > > > or
> > > > > strainphlan.py. In other words, metaphlan2 does not depend tightly
> on
> > > > this
> > > > > tool. However, I will update the wiki page later.
> > > >
> > > > Sounds like a neat tool anyway - so if there is a description I could
> > > > provide a package.
> > > >
> > > > > > > and here for strainphlan (another brother tool uses the same
> database
> > > > > > with
> > > > > > > metaphlan2 and both are in the same repository and should go
> > > > together,
> > > > > > > strainphlan is in strainphlan.py and metaphlan2 is in
> metaphlan2.py):
> > > > > > >
> > > > > >
> > > >
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1
> > > > > >
> > > > > > Well, the download file for metaphlan2 version 2.5 has
> strainer_src and
> > > > > > metaphlan2_strainer.py - is this what you mean?
> > > > > >
> > > > > Yes, strainer_src is now strainphlan_src and
> metaphlan2_strainer.py is
> > > > now
> > > > > strainphlan.py. As I mentioned before, it is better to use the
> latest
> > > > > version of the repository now because the tutorial now fits with
> the new
> > > > > names:
> > > > >
> > > >
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-metagenomic-strain-level-population-genomics
> > > > >
> > > > > and we may not change them again :).
> > > >
> > > > So *if* you want to let users use the latest state of the repository
> why
> > > > don't you do a new versioned release to make it official.  Debian
> uses a
> > > > system to check web pages for versioned releases.  We can not sneak
> into
> > > > each repository nor wild guessing if it is a stable commit or not.
> Is
> > > > there any reason not to release say version 2.6 or 2.5.1 or whatever?
> > > >
> > > > Kind regards
> > > >
> > > >   Andreas.
> > > >
> > > > > > [1] https://bitbucket.org/nsegata/hclust2
> > > >
> > > > --
> > > > http://fam-tille.de
> > > >
> >
> > --
> > http://fam-tille.de
> >
> >
>
> --
> http://fam-tille.de
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-31 Thread Andreas Tille
Hi again,

I'm pretty sure I have seen the 2.6.0 tag right after your mail.  I was
a bit busy since then and wanted to have a look now.  I noticed that
metaphlan2 moved to another user on bitbucket and

https://bitbucket.org/ljmciver/metaphlan2/downloads?tab=tags

only shows 2.5.0 as latest tag.

Am I missing something?

Kind regards

 Andreas.

On Fri, Aug 19, 2016 at 12:18:31PM +0200, Andreas Tille wrote:
> Hi Duy,
> 
> thanks for the information.  It might take some time since I'll be a bit
> offline-ish next week but now I know what the correct target for the
> package will be.
> 
> Kind regards
> 
>Andreas.
> 
> On Fri, Aug 19, 2016 at 07:17:25AM +, Duy Tin Truong wrote:
> > Hi Andreas,
> > 
> > Thanks for your explanation. We have officially updated metaphlan2 to
> > version 2.6.0 as shown in tags.
> > So, when it is convenient for you, please help to add the package of this
> > new version.
> > 
> > Many thanks,
> > Tin
> > 
> > On Fri, Aug 12, 2016 at 6:31 PM Andreas Tille  wrote:
> > 
> > > Hi Tin,
> > >
> > > On Fri, Aug 12, 2016 at 03:30:25PM +, Duy Tin Truong wrote:
> > > > > However the hint to hclust2[1] is helpful.  Unfortunately I can not
> > > find
> > > > > any description for this software.  Since you might have some
> > > influence on
> > > > > this it would be great to provide a hint where I can find a
> > > description for
> > > > > a potential package.
> > > > >
> > > > hclust2 is used to plot heat-maps and not directly used in metaphlan2.py
> > > or
> > > > strainphlan.py. In other words, metaphlan2 does not depend tightly on
> > > this
> > > > tool. However, I will update the wiki page later.
> > >
> > > Sounds like a neat tool anyway - so if there is a description I could
> > > provide a package.
> > >
> > > > > > and here for strainphlan (another brother tool uses the same 
> > > > > > database
> > > > > with
> > > > > > metaphlan2 and both are in the same repository and should go
> > > together,
> > > > > > strainphlan is in strainphlan.py and metaphlan2 is in 
> > > > > > metaphlan2.py):
> > > > > >
> > > > >
> > > https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1
> > > > >
> > > > > Well, the download file for metaphlan2 version 2.5 has strainer_src 
> > > > > and
> > > > > metaphlan2_strainer.py - is this what you mean?
> > > > >
> > > > Yes, strainer_src is now strainphlan_src and metaphlan2_strainer.py is
> > > now
> > > > strainphlan.py. As I mentioned before, it is better to use the latest
> > > > version of the repository now because the tutorial now fits with the new
> > > > names:
> > > >
> > > https://bitbucket.org/biobakery/metaphlan2#markdown-header-metagenomic-strain-level-population-genomics
> > > >
> > > > and we may not change them again :).
> > >
> > > So *if* you want to let users use the latest state of the repository why
> > > don't you do a new versioned release to make it official?  Debian uses a
> > > system to check web pages for versioned releases.  We can not sneak into
> > > each repository nor guess wildly whether it is a stable commit or not.  Is
> > > there any reason not to release say version 2.6 or 2.5.1 or whatever?
> > >
> > > Kind regards
> > >
> > >   Andreas.
> > >
> > > > > [1] https://bitbucket.org/nsegata/hclust2
> > >
> > > --
> > > http://fam-tille.de
> > >
> 
> -- 
> http://fam-tille.de
> 
> 

-- 
http://fam-tille.de



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-19 Thread Andreas Tille
Hi Duy,

thanks for the information.  It might take some time since I'll be a bit
offline-ish next week but now I know what the correct target for the
package will be.

Kind regards

   Andreas.

On Fri, Aug 19, 2016 at 07:17:25AM +, Duy Tin Truong wrote:
> Hi Andreas,
> 
> Thanks for your explanation. We have officially updated metaphlan2 to
> version 2.6.0 as shown in tags.
> So, when it is convenient for you, please help to add the package of this
> new version.
> 
> Many thanks,
> Tin
> 
> On Fri, Aug 12, 2016 at 6:31 PM Andreas Tille  wrote:
> 
> > Hi Tin,
> >
> > On Fri, Aug 12, 2016 at 03:30:25PM +, Duy Tin Truong wrote:
> > > > However the hint to hclust2[1] is helpful.  Unfortunately I can not
> > find
> > > > any description for this software.  Since you might have some
> > influence on
> > > > this it would be great to provide a hint where I can find a
> > description for
> > > > a potential package.
> > > >
> > > hclust2 is used to plot heat-maps and not directly used in metaphlan2.py
> > or
> > > strainphlan.py. In other words, metaphlan2 does not depend tightly on
> > this
> > > tool. However, I will update the wiki page later.
> >
> > Sounds like a neat tool anyway - so if there is a description I could
> > provide a package.
> >
> > > > > and here for strainphlan (another brother tool uses the same database
> > > > with
> > > > > metaphlan2 and both are in the same repository and should go
> > together,
> > > > > strainphlan is in strainphlan.py and metaphlan2 is in metaphlan2.py):
> > > > >
> > > >
> > https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1
> > > >
> > > > Well, the download file for metaphlan2 version 2.5 has strainer_src and
> > > > metaphlan2_strainer.py - is this what you mean?
> > > >
> > > Yes, strainer_src is now strainphlan_src and metaphlan2_strainer.py is
> > now
> > > strainphlan.py. As I mentioned before, it is better to use the latest
> > > version of the repository now because the tutorial now fits with the new
> > > names:
> > >
> > https://bitbucket.org/biobakery/metaphlan2#markdown-header-metagenomic-strain-level-population-genomics
> > >
> > > and we may not change them again :).
> >
> > So *if* you want to let users use the latest state of the repository why
> > don't you do a new versioned release to make it official?  Debian uses a
> > system to check web pages for versioned releases.  We can not sneak into
> > each repository nor guess wildly whether it is a stable commit or not.  Is
> > there any reason not to release say version 2.6 or 2.5.1 or whatever?
> >
> > Kind regards
> >
> >   Andreas.
> >
> > > > [1] https://bitbucket.org/nsegata/hclust2
> >
> > --
> > http://fam-tille.de
> >

-- 
http://fam-tille.de



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-19 Thread Duy Tin Truong
Hi Andreas,

Thanks for your explanation. We have officially updated metaphlan2 to
version 2.6.0 as shown in tags.
So, when it is convenient for you, please help to add the package of this
new version.

Many thanks,
Tin

On Fri, Aug 12, 2016 at 6:31 PM Andreas Tille  wrote:

> Hi Tin,
>
> On Fri, Aug 12, 2016 at 03:30:25PM +, Duy Tin Truong wrote:
> > > However the hint to hclust2[1] is helpful.  Unfortunately I can not
> find
> > > any description for this software.  Since you might have some
> influence on
> > > this it would be great to provide a hint where I can find a
> description for
> > > a potential package.
> > >
> > hclust2 is used to plot heat-maps and not directly used in metaphlan2.py
> or
> > strainphlan.py. In other words, metaphlan2 does not depend tightly on
> this
> > tool. However, I will update the wiki page later.
>
> Sounds like a neat tool anyway - so if there is a description I could
> provide a package.
>
> > > > and here for strainphlan (another brother tool uses the same database
> > > with
> > > > metaphlan2 and both are in the same repository and should go
> together,
> > > > strainphlan is in strainphlan.py and metaphlan2 is in metaphlan2.py):
> > > >
> > >
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1
> > >
> > > Well, the download file for metaphlan2 version 2.5 has strainer_src and
> > > metaphlan2_strainer.py - is this what you mean?
> > >
> > Yes, strainer_src is now strainphlan_src and metaphlan2_strainer.py is
> now
> > strainphlan.py. As I mentioned before, it is better to use the latest
> > version of the repository now because the tutorial now fits with the new
> > names:
> >
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-metagenomic-strain-level-population-genomics
> >
> > and we may not change them again :).
>
> So *if* you want to let users use the latest state of the repository why
> don't you do a new versioned release to make it official?  Debian uses a
> system to check web pages for versioned releases.  We can not sneak into
> each repository nor guess wildly whether it is a stable commit or not.  Is
> there any reason not to release say version 2.6 or 2.5.1 or whatever?
>
> Kind regards
>
>   Andreas.
>
> > > [1] https://bitbucket.org/nsegata/hclust2
>
> --
> http://fam-tille.de
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-12 Thread Andreas Tille
Hi Tin,

On Fri, Aug 12, 2016 at 03:30:25PM +, Duy Tin Truong wrote:
> > However the hint to hclust2[1] is helpful.  Unfortunately I can not find
> > any description for this software.  Since you might have some influence on
> > this it would be great to provide a hint where I can find a description for
> > a potential package.
> >
> hclust2 is used to plot heat-maps and not directly used in metaphlan2.py or
> strainphlan.py. In other words, metaphlan2 does not depend tightly on this
> tool. However, I will update the wiki page later.

Sounds like a neat tool anyway - so if there is a description I could
provide a package.
 
> > > and here for strainphlan (another brother tool uses the same database
> > with
> > > metaphlan2 and both are in the same repository and should go together,
> > > strainphlan is in strainphlan.py and metaphlan2 is in metaphlan2.py):
> > >
> > https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1
> >
> > Well, the download file for metaphlan2 version 2.5 has strainer_src and
> > metaphlan2_strainer.py - is this what you mean?
> >
> Yes, strainer_src is now strainphlan_src and metaphlan2_strainer.py is now
> strainphlan.py. As I mentioned before, it is better to use the latest
> version of the repository now because the tutorial now fits with the new
> names:
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-metagenomic-strain-level-population-genomics
> 
> and we may not change them again :).

So *if* you want to let users use the latest state of the repository why
don't you do a new versioned release to make it official?  Debian uses a
system to check web pages for versioned releases.  We can not sneak into
each repository nor guess wildly whether it is a stable commit or not.  Is
there any reason not to release say version 2.6 or 2.5.1 or whatever?

Kind regards

  Andreas.
 
> > [1] https://bitbucket.org/nsegata/hclust2

-- 
http://fam-tille.de
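
A minimal sketch of the release check discussed in the message above, assuming the upstream tags are already available as plain strings. Debian's own tool for this is uscan, driven by a debian/watch file; the code below is not how uscan is implemented, and the helper names `parse_version` and `latest_release` are hypothetical. It only illustrates why a numbered tag such as 2.6.0 gives a packager something to compare, while an untagged commit does not.

```python
import re


def parse_version(tag):
    """Turn a tag such as '2.5.0' or 'v2.6' into a tuple of ints; None if not numeric."""
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)", tag.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def latest_release(tags):
    """Return the tag with the highest numeric version, or None if no tag qualifies."""
    versioned = [(parse_version(tag), tag) for tag in tags]
    # Drop names that are not version numbers (branch names, 'tip', and so on).
    versioned = [pair for pair in versioned if pair[0] is not None]
    if not versioned:
        return None
    return max(versioned)[1]


if __name__ == "__main__":
    print(latest_release(["2.5.0", "2.6.0", "tip"]))  # prints 2.6.0
```

Comparing integer tuples rather than raw strings keeps 2.10 ahead of 2.9.1, which a plain string sort would get wrong.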



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-12 Thread Duy Tin Truong
Hi Andreas,

> However the hint to hclust2[1] is helpful.  Unfortunately I can not find
> any description for this software.  Since you might have some influence on
> this it would be great to provide a hint where I can find a description for
> a potential package.
>
hclust2 is used to plot heat-maps and not directly used in metaphlan2.py or
strainphlan.py. In other words, metaphlan2 does not depend tightly on this
tool. However, I will update the wiki page later.


> > and here for strainphlan (another brother tool uses the same database
> with
> > metaphlan2 and both are in the same repository and should go together,
> > strainphlan is in strainphlan.py and metaphlan2 is in metaphlan2.py):
> >
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1
>
> Well, the download file for metaphlan2 version 2.5 has strainer_src and
> metaphlan2_strainer.py - is this what you mean?
>
Yes, strainer_src is now strainphlan_src and metaphlan2_strainer.py is now
strainphlan.py. As I mentioned before, it is better to use the latest
version of the repository now because the tutorial now fits with the new
names:
https://bitbucket.org/biobakery/metaphlan2#markdown-header-metagenomic-strain-level-population-genomics

and we may not change them again :).

Thanks,
Tin


> Kind regards
>
> Andreas.
>
> [1] https://bitbucket.org/nsegata/hclust2
>
> >
> > On Fri, Aug 12, 2016 at 1:59 PM Andreas Tille  wrote:
> >
> > > Hi Duy,
> > >
> > > On Thu, Aug 11, 2016 at 05:03:20AM +, Duy Tin Truong wrote:
> > > > We have discussed the plan but unfortunately, we do not have enough
> > > > resource for that task now. In addition, redesigning the source
> structure
> > > > requires us to change all tutorials and that is quite expensive for
> us
> > > and
> > > > users now. We will inform you when we can separate them. Currently,
> the
> > > > tutorial for users is here:
> > > > https://bitbucket.org/biobakery/metaphlan2
> > > >
> > > > and fits well with latest version and it is quite stable now. We
> don't
> > > > think that there will be a substantial update for the source code in
> the
> > > > near future.
> > >
> > > Thanks for the explanation which helps me to make a sensible decision.
> > > I plan to do the following:
> > >
> > >   1. metaphlan2-data
> > >      The source tarball will be created by downloading the original
> > >      tarball from your site, stripping the code and converting the data using
> > >        bowtie2-build markers.fasta ../db_v20/mpa_v20_m200
> > >      The Debian source tarball created this way will ship the fasta
> > >      version of the data and rebuild the bowtie2 database at
> > >      installation time on the user's machine.  I plan to enable the admin
> > >      to opt out of immediate generation and provide a script that
> > >      does the job later.  I also plan to provide md5sums of the data
> > >      to ensure that the resulting database is really identical to the
> > >      metaphlan2 download.
> > >
> > >   2. metaphlan2 (the code):
> > >      The source tarball will be created by simply stripping the data
> > >      and the binary Debian package as it's done currently in my
> > >      packaging code.  The metaphlan2 package will depend on the
> > >      metaphlan2-data package.
> > >
> > > This does not require any change at your side but prevents over-large
> > > packages on Debian site.
> > >
> > > > Please confirm that this plan sounds sensible to you (or tell me if I
> > > > was not explicit enough in my explanation).
> > >
> > > Kind regards
> > >
> > >Andreas.
> > >
> > > --
> > > http://fam-tille.de
> > >
>
> --
> http://fam-tille.de
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-12 Thread Andreas Tille
Hi Tin,

On Fri, Aug 12, 2016 at 01:56:39PM +, Duy Tin Truong wrote:
> Yes, that is fine for me. Just a small comment, metaphlan2-data will depend
> on bowtie2 so that you can have bowtie2 to convert the fasta file.

Yes, that's obvious.

> In addition, this is the list of dependencies for metaphlan2 in case you
> need:
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites

Most of them are available in Debian and the current packaging says:

Depends: python-biom-format,
 python-msgpack,
 python-pandas,
 bowtie2

(while python-biom-format implicitly depends python-numpy and python-scipy)

However the hint to hclust2[1] is helpful.  Unfortunately I can not find
any description for this software.  Since you might have some influence on
this it would be great to provide a hint where I can find a description for
a potential package.

> and here for strainphlan (a sibling tool that uses the same database as
> metaphlan2 and both are in the same repository and should go together,
> strainphlan is in strainphlan.py and metaphlan2 is in metaphlan2.py):
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1

Well, the download file for metaphlan2 version 2.5 has strainer_src and
metaphlan2_strainer.py - is this what you mean?

Kind regards

Andreas.

[1] https://bitbucket.org/nsegata/hclust2
 
> 
> On Fri, Aug 12, 2016 at 1:59 PM Andreas Tille  wrote:
> 
> > Hi Duy,
> >
> > On Thu, Aug 11, 2016 at 05:03:20AM +, Duy Tin Truong wrote:
> > > We have discussed the plan but unfortunately, we do not have enough
> > > resource for that task now. In addition, redesigning the source structure
> > > requires us to change all tutorials and that is quite expensive for us
> > and
> > > users now. We will inform you when we can separate them. Currently, the
> > > tutorial for users is here:
> > > https://bitbucket.org/biobakery/metaphlan2
> > >
> > > and fits well with latest version and it is quite stable now. We don't
> > > think that there will be a substantial update for the source code in the
> > > near future.
> >
> > Thanks for the explanation which helps me to make a sensible decision.
> > I plan to do the following:
> >
> >   1. metaphlan2-data
> >      The source tarball will be created by downloading the original
> >      tarball from your site, stripping the code and converting the data using
> >        bowtie2-build markers.fasta ../db_v20/mpa_v20_m200
> >      The Debian source tarball created this way will ship the fasta
> >      version of the data and rebuild the bowtie2 database at
> >      installation time on the user's machine.  I plan to enable the admin
> >      to opt out of immediate generation and provide a script that
> >      does the job later.  I also plan to provide md5sums of the data
> >      to ensure that the resulting database is really identical to the
> >      metaphlan2 download.
> >
> >   2. metaphlan2 (the code):
> >      The source tarball will be created by simply stripping the data
> >      and the binary Debian package as it's done currently in my
> >      packaging code.  The metaphlan2 package will depend on the
> >      metaphlan2-data package.
> >
> > This does not require any change at your side but prevents over-large
> > packages on Debian site.
> >
> > Please confirm that this plan sounds sensible to you (or tell me if I
> > was not explicit enough in my explanation).
> >
> > Kind regards
> >
> >Andreas.
> >
> > --
> > http://fam-tille.de
> >

-- 
http://fam-tille.de



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-12 Thread Duy Tin Truong
Hi Andreas,

Yes, that is fine for me. Just a small comment, metaphlan2-data will depend
on bowtie2 so that you can have bowtie2 to convert the fasta file.
In addition, this is the list of dependencies for metaphlan2 in case you
need:
https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites
and here for strainphlan (a sibling tool that uses the same database as
metaphlan2 and both are in the same repository and should go together,
strainphlan is in strainphlan.py and metaphlan2 is in metaphlan2.py):
https://bitbucket.org/biobakery/metaphlan2#markdown-header-pre-requisites_1

Thanks,
Tin


On Fri, Aug 12, 2016 at 1:59 PM Andreas Tille  wrote:

> Hi Duy,
>
> On Thu, Aug 11, 2016 at 05:03:20AM +, Duy Tin Truong wrote:
> > We have discussed the plan but unfortunately, we do not have enough
> > resources for that task now. In addition, redesigning the source structure
> > requires us to change all tutorials and that is quite expensive for us
> > and users now. We will inform you when we can separate them. Currently,
> > the tutorial for users is here:
> > https://bitbucket.org/biobakery/metaphlan2
> >
> > It fits well with the latest version, which is quite stable now. We don't
> > think that there will be a substantial update for the source code in the
> > near future.
>
> Thanks for the explanation which helps me to make a sensible decision.
> I plan to do the following:
>
>1. metaphlan2-data
>   The source tarball will be created by downloading the original
>   tarball from your site, stripping the code and converting the data
>   using
>      bowtie2-build markers.fasta ../db_v20/mpa_v20_m200
>   The Debian source tarball created this way will ship the fasta
>   version of the data, and the bowtie2 database will be rebuilt at
>   installation time on the user's machine.  I plan to enable the admin
>   to opt out of immediate generation and provide a script that
>   does the job later.  I also plan to provide md5sums of the data
>   to ensure that the resulting database is really identical to the
>   metaphlan2 download.
>
>2. metaphlan2 (the code):
>   The source tarball will be created by simply stripping the data,
>   and the binary Debian package will be built as is currently done in
>   my packaging code.  The metaphlan2 package will depend on the
>   metaphlan2-data package.
>
> This does not require any change on your side but prevents overly large
> packages on the Debian side.
>
> Please confirm that this plan sounds sensible to you (or tell me if I was
> not explicit enough in my explanation).
>
> Kind regards
>
>Andreas.
>
> --
> http://fam-tille.de
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-12 Thread Andreas Tille
Hi Duy,

On Thu, Aug 11, 2016 at 05:03:20AM +, Duy Tin Truong wrote:
> We have discussed the plan but unfortunately, we do not have enough
> resources for that task now. In addition, redesigning the source structure
> requires us to change all tutorials and that is quite expensive for us and
> users now. We will inform you when we can separate them. Currently, the
> tutorial for users is here:
> https://bitbucket.org/biobakery/metaphlan2
>
> It fits well with the latest version, which is quite stable now. We don't
> think that there will be a substantial update for the source code in the
> near future.

Thanks for the explanation which helps me to make a sensible decision.
I plan to do the following:

   1. metaphlan2-data
      The source tarball will be created by downloading the original
      tarball from your site, stripping the code and converting the data
      using
          bowtie2-build markers.fasta ../db_v20/mpa_v20_m200
      The Debian source tarball created this way will ship the fasta
      version of the data, and the bowtie2 database will be rebuilt at
      installation time on the user's machine.  I plan to enable the admin
      to opt out of immediate generation and provide a script that
      does the job later.  I also plan to provide md5sums of the data
      to ensure that the resulting database is really identical to the
      metaphlan2 download.

   2. metaphlan2 (the code):
      The source tarball will be created by simply stripping the data,
      and the binary Debian package will be built as is currently done in
      my packaging code.  The metaphlan2 package will depend on the
      metaphlan2-data package.
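
The md5sum check mentioned in item 1 could look roughly like the following. This is a minimal sketch in Python, not the real packaging code: the function names and the manifest layout (a mapping from file name to expected digest) are assumptions made for illustration, and the actual package may simply use a plain md5sums file.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(directory, expected):
    """Return the sorted names of files that are missing or whose MD5 differs.

    `expected` maps a file name to its expected hex digest (hypothetical
    manifest format chosen for this sketch).
    """
    mismatches = []
    for name, want in expected.items():
        path = Path(directory) / name
        if not path.is_file() or md5_of_file(path) != want:
            mismatches.append(name)
    return sorted(mismatches)
```

An empty result from `find_mismatches` would indicate that every rebuilt index file matches the digest recorded from the upstream download.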

This does not require any change on your side but prevents overly large
packages on the Debian side.

Please confirm that this plan sounds sensible to you (or tell me if I was
not explicit enough in my explanation).

Kind regards

   Andreas.

-- 
http://fam-tille.de



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-10 Thread Duy Tin Truong
Hi Andreas,

We have discussed the plan but unfortunately, we do not have enough
resources for that task now. In addition, redesigning the source structure
requires us to change all tutorials and that is quite expensive for us and
users now. We will inform you when we can separate them. Currently, the
tutorial for users is here:
https://bitbucket.org/biobakery/metaphlan2

It fits well with the latest version, which is quite stable now. We don't
think that there will be a substantial update for the source code in the
near future.

Thanks,
Tin


On Wed, Aug 10, 2016 at 5:18 PM Andreas Tille  wrote:

> Hi again,
>
> On Sat, Aug 06, 2016 at 01:47:23PM +, Duy Tin Truong wrote:
> > Regarding the separation code and data issue, I will discuss with Nicola
> > next Monday and let you know.
>
> any plan how to provide code and data in the future?
>
> Kind regards
>
>  Andreas.
>
> --
> http://fam-tille.de
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-10 Thread Andreas Tille
Hi again,

On Sat, Aug 06, 2016 at 01:47:23PM +, Duy Tin Truong wrote:
> Regarding the separation code and data issue, I will discuss with Nicola
> next Monday and let you know.

Is there any plan for how to provide code and data in the future?

Kind regards

 Andreas.

-- 
http://fam-tille.de



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-06 Thread Duy Tin Truong
Hi Andreas,

I meant that the latest version of the repository fits the tutorial on the
repository. If you use the older version (with the old names), I am afraid
users will have problems when following the tutorial.
Regarding the issue of separating code and data, I will discuss it with
Nicola next Monday and let you know.

Thanks,
Tin

On Fri, Aug 5, 2016 at 10:21 PM Andreas Tille  wrote:

> Hi Tin,
>
> I need to admit that I can not parse the information you gave in your mail.
>
> It is also not really connected to my next mail (which is archived here
>https://lists.debian.org/debian-med/2016/08/msg00040.html ) about the
> separation of code and data.
>
> Kind regards
>
>   Andreas.
>
> On Thu, Aug 04, 2016 at 11:48:08AM +, Duy Tin Truong wrote:
> > Hi Andreas,
> >
> > If you can use the latest version with the name changes as I mentioned,
> it
> > would fit better with the updated tutorial on the metaphlan2 repository.
> >
> > Thanks,
> > Tin
> >
> > On Thu, Aug 4, 2016 at 1:28 PM Nicola Segata 
> wrote:
> >
> > > Hi Andreas,
> > >  yes, it is likely that the code will be frequently updated, but the
> big
> > > database file will change only rarely (for sure no more frequently than
> > > once a year).
> > > thanks
> > > Nicola
> > >
> > > On Thu, Aug 4, 2016 at 12:46 PM Andreas Tille 
> wrote:
> > >
> > >> Hi again,
> > >>
> > >> On Thu, Aug 04, 2016 at 08:10:29AM +, Nicola Segata wrote:
> > >> > Makes sense to me!
> > >>
> > >>https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=833388#15
> > >>
> > >> If you read the discussion it seems that my suggestion to ship the
> fasta
> > >> file inside the Debian package and let the postinst do the
> > >> transformation step found some agreement - provided that there are no
> > >> frequent changes in the package and several uploads per month will
> > >> happen.
> > >>
> > >> I'm now wondering what your estimated change rate for the metaphlan2
> > >> data files might be.  Do these change frequently?  Is there any chance
> > >> that the code changes frequently but the data files stay unchanged?
> > >>
> > >> Kind regards
> > >>
> > >>   Andreas.
> > >>
> > >> > On Thu, Aug 4, 2016 at 8:18 AM Andreas Tille 
> wrote:
> > >> >
> > >> > > Hi Nicola,
> > >> > >
> > >> > > On Wed, Aug 03, 2016 at 08:51:33PM +, Nicola Segata wrote:
> > >> > > > Great, thanks Andreas. We provide the "*.bt2" files so that the
> > >> user can
> > >> > > > run BowTie2 internally to MetaPhlAn directly without first
> building
> > >> the
> > >> > > > indexes (it will take quite a bit of time).
> > >> > >
> > >> > > Fully agreed here.
> > >> > >
> > >> > > > Also, the indexes are smaller
> > >> > > > in size than the sequence file...
> > >> > >
> > >> > > Hmmm, all *.bt2 files sum up to 1,124,449kB while the fasta file
> has
> > >> > > only 753081kB.  Considering the better compression performance of
> pure
> > >> > > text files a compressed archive containing the fasta is
> drastically
> > >> > > smaller than one with the *.bt2 files.  Yesterday I tried to
> start a
> > >> > > discussion how to deal with the size of the data inside Debian[1]
> (no
> > >> > > answer so far) and my experiment to create a source tarball just
> > >> > > containing the fasta resulted in a 270MB *xz* compressed file
> (well xz
> > >> > > is better than gz but lets say the compressed tarball with the
> fasta
> > >> is
> > >> > > about 30% of size of your current download of 1.017MB.
> > >> > >
> > >> > > The situation for Debian is different than from your users:  A
> user
> > >> who
> > >> > > downloads from your website intends to run metaphlan2.  Amongst
> the
> > >> > > millions of Debian users only very few are interested in
> metaphlan2
> > >> and
> > >> > > we need to outweight how much resources we could spent.  Its not
> that
> > >> > > only Debian provides resources.  There is a large mirroring
> network
> > >> that
> > >> > > spents lots of bandwidth and disk space for a very small usage.
> So in
> > >> > > this case it makes sense to put the effort on the users side to
> > >> > > regenerate the indexes (or even download the data separately via a
> > >> > > script we could provide inside the package).  So I could imagine
> to
> > >> > > package only the metaphlan2 code and provide a script that
> downloads
> > >> the
> > >> > > data and puts them into the expected place.
> > >> > >
> > >> > > Kind regards
> > >> > >
> > >> > >  Andreas.
> > >> > >
> > >> > > [1]
> > >> > >
> > >>
> https://lists.alioth.debian.org/pipermail/debian-med-packaging/2016-August/044984.html
> > >> > >
> > >> > > > cheers
> > >> > > > Nicola
> > >> > > >
> > >> > > > On Wed, Aug 3, 2016 at 6:08 PM Andreas Tille 
> > >> wrote:
> > >> > > >
> > >> > > > > Hi Tin,
> > >> > > > >
> > >> > > > > On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong
> wrote:
> > >> > > > > > > - Tin can also provide more info 

Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-05 Thread Andreas Tille
Hi Tin,

I must admit that I cannot parse the information you gave in your mail.

It is also not really connected to my next mail (which is archived here
   https://lists.debian.org/debian-med/2016/08/msg00040.html ) about the
separation of code and data.

Kind regards

  Andreas.

On Thu, Aug 04, 2016 at 11:48:08AM +, Duy Tin Truong wrote:
> Hi Andreas,
> 
> If you can use the latest version with the name changes as I mentioned, it
> would fit better with the updated tutorial on the metaphlan2 repository.
> 
> Thanks,
> Tin
> 
> On Thu, Aug 4, 2016 at 1:28 PM Nicola Segata  wrote:
> 
> > Hi Andreas,
> >  yes, it is likely that the code will be frequently updated, but the big
> > database file will change only rarely (for sure no more frequently than
> > once a year).
> > thanks
> > Nicola
> >
> > On Thu, Aug 4, 2016 at 12:46 PM Andreas Tille  wrote:
> >
> >> Hi again,
> >>
> >> On Thu, Aug 04, 2016 at 08:10:29AM +, Nicola Segata wrote:
> >> > Makes sense to me!
> >>
> >>https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=833388#15
> >>
> >> If you read the discussion it seems that my suggestion to ship the fasta
> >> file inside the Debian package and let the postinst do the
> >> transformation step found some agreement - provided that there are no
> >> frequent changes in the package and several uploads per month will
> >> happen.
> >>
> >> I'm now wondering what your estimated change rate for the metaphlan2
> >> data files might be.  Do these change frequently?  Is there any chance
> >> that the code changes frequently but the data files stay unchanged?
> >>
> >> Kind regards
> >>
> >>   Andreas.
> >>
> >> > On Thu, Aug 4, 2016 at 8:18 AM Andreas Tille  wrote:
> >> >
> >> > > Hi Nicola,
> >> > >
> >> > > On Wed, Aug 03, 2016 at 08:51:33PM +, Nicola Segata wrote:
> >> > > > Great, thanks Andreas. We provide the "*.bt2" files so that the
> >> user can
> >> > > > run BowTie2 internally to MetaPhlAn directly without first building
> >> the
> >> > > > indexes (it will take quite a bit of time).
> >> > >
> >> > > Fully agreed here.
> >> > >
> >> > > > Also, the indexes are smaller
> >> > > > in size than the sequence file...
> >> > >
> >> > > Hmmm, all *.bt2 files sum up to 1,124,449kB while the fasta file has
> >> > > only 753081kB.  Considering the better compression performance of pure
> >> > > text files a compressed archive containing the fasta is drastically
> >> > > smaller than one with the *.bt2 files.  Yesterday I tried to start a
> >> > > discussion how to deal with the size of the data inside Debian[1] (no
> >> > > answer so far) and my experiment to create a source tarball just
> >> > > containing the fasta resulted in a 270MB *xz* compressed file (well xz
> >> > > is better than gz but lets say the compressed tarball with the fasta
> >> is
> >> > > about 30% of size of your current download of 1.017MB.
> >> > >
> >> > > The situation for Debian is different than from your users:  A user
> >> who
> >> > > downloads from your website intends to run metaphlan2.  Amongst the
> >> > > millions of Debian users only very few are interested in metaphlan2
> >> and
> >> > > we need to outweight how much resources we could spent.  Its not that
> >> > > only Debian provides resources.  There is a large mirroring network
> >> that
> >> > > spents lots of bandwidth and disk space for a very small usage.  So in
> >> > > this case it makes sense to put the effort on the users side to
> >> > > regenerate the indexes (or even download the data separately via a
> >> > > script we could provide inside the package).  So I could imagine to
> >> > > package only the metaphlan2 code and provide a script that downloads
> >> the
> >> > > data and puts them into the expected place.
> >> > >
> >> > > Kind regards
> >> > >
> >> > >  Andreas.
> >> > >
> >> > > [1]
> >> > >
> >> https://lists.alioth.debian.org/pipermail/debian-med-packaging/2016-August/044984.html
> >> > >
> >> > > > cheers
> >> > > > Nicola
> >> > > >
> >> > > > On Wed, Aug 3, 2016 at 6:08 PM Andreas Tille 
> >> wrote:
> >> > > >
> >> > > > > Hi Tin,
> >> > > > >
> >> > > > > On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong wrote:
> >> > > > > > > - Tin can also provide more info about the binary data in
> >> db_v20.
> >> > > The
> >> > > > > files
> >> > > > > > > ending with "bt2" are created using a script in the Bowtie2
> >> package
> >> > > > > > > (bowtie2-build) using a sequence file Tin can provide (it can
> >> also
> >> > > be
> >> > > > > > > recovered from the bt2 files with bowtie2-inspect if I
> >> remember
> >> > > well).
> >> > > > > > As Nicola said, those files in db_v20 are created with
> >> bowtie2-build
> >> > > > > > using a sequence file and you can recover the sequence file by:
> >> > > > > >
> >> > > > > > bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 >
> >> > > metaphlan2/markers.fasta
> >> > > > > >
> >> > > > > 

Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-04 Thread Duy Tin Truong
Hi Andreas,

If you can use the latest version with the name changes as I mentioned, it
would fit better with the updated tutorial on the metaphlan2 repository.

Thanks,
Tin

On Thu, Aug 4, 2016 at 1:28 PM Nicola Segata  wrote:

> Hi Andreas,
>  yes, it is likely that the code will be frequently updated, but the big
> database file will change only rarely (for sure no more frequently than
> once a year).
> thanks
> Nicola
>
> On Thu, Aug 4, 2016 at 12:46 PM Andreas Tille  wrote:
>
>> Hi again,
>>
>> On Thu, Aug 04, 2016 at 08:10:29AM +, Nicola Segata wrote:
>> > Makes sense to me!
>>
>>https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=833388#15
>>
>> If you read the discussion it seems that my suggestion to ship the fasta
>> file inside the Debian package and let the postinst do the
>> transformation step found some agreement - provided that there are no
>> frequent changes in the package and several uploads per month will
>> happen.
>>
>> I'm now wondering what your estimated change rate for the metaphlan2
>> data files might be.  Do these change frequently?  Is there any chance
>> that the code changes frequently but the data files stay unchanged?
>>
>> Kind regards
>>
>>   Andreas.
>>
>> > On Thu, Aug 4, 2016 at 8:18 AM Andreas Tille  wrote:
>> >
>> > > Hi Nicola,
>> > >
>> > > On Wed, Aug 03, 2016 at 08:51:33PM +, Nicola Segata wrote:
>> > > > Great, thanks Andreas. We provide the "*.bt2" files so that the
>> user can
>> > > > run BowTie2 internally to MetaPhlAn directly without first building
>> the
>> > > > indexes (it will take quite a bit of time).
>> > >
>> > > Fully agreed here.
>> > >
>> > > > Also, the indexes are smaller
>> > > > in size than the sequence file...
>> > >
>> > > Hmmm, all *.bt2 files sum up to 1,124,449kB while the fasta file has
>> > > only 753081kB.  Considering the better compression performance of pure
>> > > text files a compressed archive containing the fasta is drastically
>> > > smaller than one with the *.bt2 files.  Yesterday I tried to start a
>> > > discussion how to deal with the size of the data inside Debian[1] (no
>> > > answer so far) and my experiment to create a source tarball just
>> > > containing the fasta resulted in a 270MB *xz* compressed file (well xz
>> > > is better than gz but lets say the compressed tarball with the fasta
>> is
>> > > about 30% of size of your current download of 1.017MB.
>> > >
>> > > The situation for Debian is different than from your users:  A user
>> who
>> > > downloads from your website intends to run metaphlan2.  Amongst the
>> > > millions of Debian users only very few are interested in metaphlan2
>> and
>> > > we need to outweight how much resources we could spent.  Its not that
>> > > only Debian provides resources.  There is a large mirroring network
>> that
>> > > spents lots of bandwidth and disk space for a very small usage.  So in
>> > > this case it makes sense to put the effort on the users side to
>> > > regenerate the indexes (or even download the data separately via a
>> > > script we could provide inside the package).  So I could imagine to
>> > > package only the metaphlan2 code and provide a script that downloads
>> the
>> > > data and puts them into the expected place.
>> > >
>> > > Kind regards
>> > >
>> > >  Andreas.
>> > >
>> > > [1]
>> > >
>> https://lists.alioth.debian.org/pipermail/debian-med-packaging/2016-August/044984.html
>> > >
>> > > > cheers
>> > > > Nicola
>> > > >
>> > > > On Wed, Aug 3, 2016 at 6:08 PM Andreas Tille 
>> wrote:
>> > > >
>> > > > > Hi Tin,
>> > > > >
>> > > > > On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong wrote:
>> > > > > > > - Tin can also provide more info about the binary data in
>> db_v20.
>> > > The
>> > > > > files
>> > > > > > > ending with "bt2" are created using a script in the Bowtie2
>> package
>> > > > > > > (bowtie2-build) using a sequence file Tin can provide (it can
>> also
>> > > be
>> > > > > > > recovered from the bt2 files with bowtie2-inspect if I
>> remember
>> > > well).
>> > > > > > As Nicola said, those files in db_v20 are created with
>> bowtie2-build
>> > > > > > using a sequence file and you can recover the sequence file by:
>> > > > > >
>> > > > > > bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 >
>> > > metaphlan2/markers.fasta
>> > > > > >
>> > > > > > If you want to rebuild them, the command is:
>> > > > > >
>> > > > > > bowtie2-build metaphlan2/markers.fasta
>> metaphlan2/db_v21/mpa_v21_m200
>> > > > >
>> > > > > I can confirm that I can reproduce the files byte identical from
>> > > > > markers.fasta.  Is there any reason to ship the binary form
>> instead of
>> > > > > the fasta text file?  Moreover, what is the source of the
>> > > markers.fasta?
>> > > > > Is there any related publication or so?
>> > > > >
>> > > > > > > For the mpa_v20_m200.pkl Tin can also provide the uncompressed
>> > > python
>> > > > > > > object (or he can 

Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-04 Thread Nicola Segata
Hi Andreas,
Yes, it is likely that the code will be updated frequently, but the big
database file will change only rarely (certainly no more often than
once a year).
thanks
Nicola

On Thu, Aug 4, 2016 at 12:46 PM Andreas Tille  wrote:

> Hi again,
>
> On Thu, Aug 04, 2016 at 08:10:29AM +, Nicola Segata wrote:
> > Makes sense to me!
>
>https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=833388#15
>
> If you read the discussion it seems that my suggestion to ship the fasta
> file inside the Debian package and let the postinst do the
> transformation step found some agreement - provided that there are no
> frequent changes in the package and several uploads per month will
> happen.
>
> I'm now wondering what your estimated change rate for the metaphlan2
> data files might be.  Do these change frequently?  Is there any chance
> that the code changes frequently but the data files stay unchanged?
>
> Kind regards
>
>   Andreas.
>
> > On Thu, Aug 4, 2016 at 8:18 AM Andreas Tille  wrote:
> >
> > > Hi Nicola,
> > >
> > > On Wed, Aug 03, 2016 at 08:51:33PM +, Nicola Segata wrote:
> > > > Great, thanks Andreas. We provide the "*.bt2" files so that the user
> can
> > > > run BowTie2 internally to MetaPhlAn directly without first building
> the
> > > > indexes (it will take quite a bit of time).
> > >
> > > Fully agreed here.
> > >
> > > > Also, the indexes are smaller
> > > > in size than the sequence file...
> > >
> > > Hmmm, all *.bt2 files sum up to 1,124,449kB while the fasta file has
> > > only 753081kB.  Considering the better compression performance of pure
> > > text files a compressed archive containing the fasta is drastically
> > > smaller than one with the *.bt2 files.  Yesterday I tried to start a
> > > discussion how to deal with the size of the data inside Debian[1] (no
> > > answer so far) and my experiment to create a source tarball just
> > > containing the fasta resulted in a 270MB *xz* compressed file (well xz
> > > is better than gz but lets say the compressed tarball with the fasta is
> > > about 30% of size of your current download of 1.017MB.
> > >
> > > The situation for Debian is different than from your users:  A user who
> > > downloads from your website intends to run metaphlan2.  Amongst the
> > > millions of Debian users only very few are interested in metaphlan2 and
> > > we need to outweight how much resources we could spent.  Its not that
> > > only Debian provides resources.  There is a large mirroring network
> that
> > > spents lots of bandwidth and disk space for a very small usage.  So in
> > > this case it makes sense to put the effort on the users side to
> > > regenerate the indexes (or even download the data separately via a
> > > script we could provide inside the package).  So I could imagine to
> > > package only the metaphlan2 code and provide a script that downloads
> the
> > > data and puts them into the expected place.
> > >
> > > Kind regards
> > >
> > >  Andreas.
> > >
> > > [1]
> > >
> https://lists.alioth.debian.org/pipermail/debian-med-packaging/2016-August/044984.html
> > >
> > > > cheers
> > > > Nicola
> > > >
> > > > On Wed, Aug 3, 2016 at 6:08 PM Andreas Tille 
> wrote:
> > > >
> > > > > Hi Tin,
> > > > >
> > > > > On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong wrote:
> > > > > > > - Tin can also provide more info about the binary data in
> db_v20.
> > > The
> > > > > files
> > > > > > > ending with "bt2" are created using a script in the Bowtie2
> package
> > > > > > > (bowtie2-build) using a sequence file Tin can provide (it can
> also
> > > be
> > > > > > > recovered from the bt2 files with bowtie2-inspect if I remember
> > > well).
> > > > > > As Nicola said, those files in db_v20 are created with
> bowtie2-build
> > > > > > using a sequence file and you can recover the sequence file by:
> > > > > >
> > > > > > bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 >
> > > metaphlan2/markers.fasta
> > > > > >
> > > > > > If you want to rebuild them, the command is:
> > > > > >
> > > > > > bowtie2-build metaphlan2/markers.fasta
> metaphlan2/db_v21/mpa_v21_m200
> > > > >
> > > > > I can confirm that I can reproduce the files byte identical from
> > > > > markers.fasta.  Is there any reason to ship the binary form
> instead of
> > > > > the fasta text file?  Moreover, what is the source of the
> > > markers.fasta?
> > > > > Is there any related publication or so?
> > > > >
> > > > > > > For the mpa_v20_m200.pkl Tin can also provide the uncompressed
> > > python
> > > > > > > object (or he can provide a couple of lines of code to
> uncompress
> > > it?)
> > > > > > It is python dictionary and can be read as:
> > > > > >
> > > > > > > import cPickle as pickle
> > > > > > > import bz2
> > > > > > db = pickle.load(bz2.BZ2File('db_v20/mpa_v20_m200.pkl', 'r'))
> > > > > >
> > > > > > You can have more information about them at:
> > > > > >
> > > > >
> > >
> 

Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-04 Thread Andreas Tille
Hi again,

On Thu, Aug 04, 2016 at 08:10:29AM +, Nicola Segata wrote:
> Makes sense to me!

   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=833388#15

If you read the discussion, it seems that my suggestion to ship the fasta
file inside the Debian package and let the postinst do the
transformation step found some agreement, provided that the package does
not change frequently, i.e. that there will not be several uploads per
month.
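
The install-time decision could be sketched as follows. This is an illustration only, not the real postinst: the function names and the `admin_opted_out` flag are hypothetical, and the suffix list assumes the six files that bowtie2-build writes for a small (non-large) index.

```python
from pathlib import Path

# The six files bowtie2-build writes for a small index with a given prefix.
INDEX_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def index_is_complete(index_prefix):
    """Return True if every expected index file exists for this prefix."""
    return all(Path(index_prefix + suffix).is_file() for suffix in INDEX_SUFFIXES)


def should_build_at_install(index_prefix, admin_opted_out):
    """Decide whether the install step should run the conversion now."""
    if admin_opted_out:
        # The admin prefers to generate the database later with a script.
        return False
    return not index_is_complete(index_prefix)
```

Skipping the build when the index is already complete would keep package upgrades cheap as long as the data files themselves stay unchanged.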

I'm now wondering what your estimated change rate for the metaphlan2
data files might be.  Do these change frequently?  Is there any chance
that the code changes frequently but the data files stay unchanged?

Kind regards

  Andreas.
 
> On Thu, Aug 4, 2016 at 8:18 AM Andreas Tille  wrote:
> 
> > Hi Nicola,
> >
> > On Wed, Aug 03, 2016 at 08:51:33PM +, Nicola Segata wrote:
> > > Great, thanks Andreas. We provide the "*.bt2" files so that the user can
> > > run BowTie2 internally to MetaPhlAn directly without first building the
> > > indexes (it will take quite a bit of time).
> >
> > Fully agreed here.
> >
> > > Also, the indexes are smaller
> > > in size than the sequence file...
> >
> > Hmmm, all *.bt2 files sum up to 1,124,449kB while the fasta file has
> > only 753081kB.  Considering the better compression performance of pure
> > text files a compressed archive containing the fasta is drastically
> > smaller than one with the *.bt2 files.  Yesterday I tried to start a
> > discussion how to deal with the size of the data inside Debian[1] (no
> > answer so far) and my experiment to create a source tarball just
> > containing the fasta resulted in a 270MB *xz* compressed file (well xz
> > is better than gz but lets say the compressed tarball with the fasta is
> > about 30% of size of your current download of 1.017MB.
> >
> > The situation for Debian is different than from your users:  A user who
> > downloads from your website intends to run metaphlan2.  Amongst the
> > millions of Debian users only very few are interested in metaphlan2 and
> > we need to outweight how much resources we could spent.  Its not that
> > only Debian provides resources.  There is a large mirroring network that
> > spents lots of bandwidth and disk space for a very small usage.  So in
> > this case it makes sense to put the effort on the users side to
> > regenerate the indexes (or even download the data separately via a
> > script we could provide inside the package).  So I could imagine to
> > package only the metaphlan2 code and provide a script that downloads the
> > data and puts them into the expected place.
> >
> > Kind regards
> >
> >  Andreas.
> >
> > [1]
> > https://lists.alioth.debian.org/pipermail/debian-med-packaging/2016-August/044984.html
> >
> > > cheers
> > > Nicola
> > >
> > > On Wed, Aug 3, 2016 at 6:08 PM Andreas Tille  wrote:
> > >
> > > > Hi Tin,
> > > >
> > > > On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong wrote:
> > > > > > - Tin can also provide more info about the binary data in db_v20.
> > The
> > > > files
> > > > > > ending with "bt2" are created using a script in the Bowtie2 package
> > > > > > (bowtie2-build) using a sequence file Tin can provide (it can also
> > be
> > > > > > recovered from the bt2 files with bowtie2-inspect if I remember
> > well).
> > > > > As Nicola said, those files in db_v20 are created with bowtie2-build
> > > > > using a sequence file and you can recover the sequence file by:
> > > > >
> > > > > bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 >
> > metaphlan2/markers.fasta
> > > > >
> > > > > If you want to rebuild them, the command is:
> > > > >
> > > > > bowtie2-build metaphlan2/markers.fasta metaphlan2/db_v21/mpa_v21_m200
> > > >
> > > > I can confirm that I can reproduce the files byte identical from
> > > > markers.fasta.  Is there any reason to ship the binary form instead of
> > > > the fasta text file?  Moreover, what is the source of the
> > markers.fasta?
> > > > Is there any related publication or so?
> > > >
> > > > > > For the mpa_v20_m200.pkl Tin can also provide the uncompressed
> > python
> > > > > > object (or he can provide a couple of lines of code to uncompress
> > it?)
> > > > > It is python dictionary and can be read as:
> > > > >
> > > > > > import cPickle as pickle
> > > > > > import bz2
> > > > > db = pickle.load(bz2.BZ2File('db_v20/mpa_v20_m200.pkl', 'r'))
> > > > >
> > > > > You can have more information about them at:
> > > > >
> > > >
> > https://bitbucket.org/biobakery/metaphlan2#markdown-header-customizing-the-database
> > > >
> > > > OK, that page clarifies the method.  Just a personal remark from the
> > > > point of view of an outsider of bioinformatics:  I'd regard the
> > creation
> > > > process of the mpa_v20_m200.pkl file a bit cumbersome.  I'd personally
> > > > prefer droping some text record somewhere and call a script processing
> > > > this record rather than writing an own script.
> > > >
> > > > > In addition, some files were changed the names:
> > > 

Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-04 Thread Nicola Segata
Makes sense to me!
thanks
Nicola

On Thu, Aug 4, 2016 at 8:18 AM Andreas Tille  wrote:

> Hi Nicola,
>
> On Wed, Aug 03, 2016 at 08:51:33PM +, Nicola Segata wrote:
> > Great, thanks Andreas. We provide the "*.bt2" files so that the user can
> > run BowTie2 internally to MetaPhlAn directly without first building the
> > indexes (it will take quite a bit of time).
>
> Fully agreed here.
>
> > Also, the indexes are smaller
> > in size than the sequence file...
>
> Hmmm, all *.bt2 files sum up to 1,124,449kB while the fasta file has
> only 753,081kB.  Considering the better compression performance of pure
> text files, a compressed archive containing the fasta is drastically
> smaller than one with the *.bt2 files.  Yesterday I tried to start a
> discussion on how to deal with the size of the data inside Debian[1] (no
> answer so far), and my experiment to create a source tarball containing
> just the fasta resulted in a 270MB *xz* compressed file.  Admittedly xz
> compresses better than gz, but let's say the compressed tarball with the
> fasta is about 30% of the size of your current download of 1,017MB.
>
> The situation for Debian is different from that of your users:  A user who
> downloads from your website intends to run metaphlan2.  Amongst the
> millions of Debian users only very few are interested in metaphlan2 and
> we need to weigh up how many resources we can spend.  It's not only
> Debian that provides resources:  there is a large mirroring network that
> spends lots of bandwidth and disk space for very little usage.  So in
> this case it makes sense to put the effort on the user's side to
> regenerate the indexes (or even download the data separately via a
> script we could provide inside the package).  So I could imagine
> packaging only the metaphlan2 code and providing a script that downloads
> the data and puts it into the expected place.
>
> Kind regards
>
>  Andreas.
>
> [1]
> https://lists.alioth.debian.org/pipermail/debian-med-packaging/2016-August/044984.html
>
> > cheers
> > Nicola
> >
> > On Wed, Aug 3, 2016 at 6:08 PM Andreas Tille  wrote:
> >
> > > Hi Tin,
> > >
> > > On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong wrote:
> > > > > - Tin can also provide more info about the binary data in db_v20.
> The
> > > files
> > > > > ending with "bt2" are created using a script in the Bowtie2 package
> > > > > (bowtie2-build) using a sequence file Tin can provide (it can also
> be
> > > > > recovered from the bt2 files with bowtie2-inspect if I remember
> well).
> > > > As Nicola said, those files in db_v20 are created with bowtie2-build
> > > > using a sequence file and you can recover the sequence file by:
> > > >
> > > > bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 >
> metaphlan2/markers.fasta
> > > >
> > > > If you want to rebuild them, the command is:
> > > >
> > > > bowtie2-build metaphlan2/markers.fasta metaphlan2/db_v21/mpa_v21_m200
> > >
> > > I can confirm that I can reproduce the files byte identical from
> > > markers.fasta.  Is there any reason to ship the binary form instead of
> > > the fasta text file?  Moreover, what is the source of the
> markers.fasta?
> > > Is there any related publication or so?
> > >
> > > > > For the mpa_v20_m200.pkl Tin can also provide the uncompressed
> python
> > > > > object (or he can provide a couple of lines of code to uncompress
> it?)
> > > > It is python dictionary and can be read as:
> > > >
> > > > import cPickle as pickle
> > > > import bz2
> > > > db = pickle.load(bz2.BZ2File('db_v20/mpa_v20_m200.pkl', 'r'))
> > > >
> > > > You can have more information about them at:
> > > >
> > >
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-customizing-the-database
> > >
> > > OK, that page clarifies the method.  Just a personal remark from the
> > > point of view of an outsider of bioinformatics:  I'd regard the
> creation
> > > process of the mpa_v20_m200.pkl file a bit cumbersome.  I'd personally
> > > prefer droping some text record somewhere and call a script processing
> > > this record rather than writing an own script.
> > >
> > > > In addition, some files were changed the names:
> > > >- metaphlan2_strainer.py -> strainphlan.py
> > > >- strainer_src -> strainphlan_src
> > > >- strainer_tutorial -> strainphlan_tutorial
> > > >
> > > > Some source files were updated as well.
> > > > Please let me know if you need other information.
> > >
> > > Just drop me a not once you might release a new version containing
> these
> > > changes.  I think I'll try to release the current version as is since
> at
> > > least the origin of the files is clarified now.  I'm not yet sure
> whether
> > > the size of the data is acceptable or might spoil some limit.
> Regarding
> > > this I'm wondering whether I create a source tarball including rather
> > > markers.fasta and create the bt2 files in the build process.
> > >
> > > Kind regards
> > >
> > >Andreas.
> > >
> > > --
> > > http://fam-tille.de
> > >
>

Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-04 Thread Andreas Tille
Hi Nicola,

On Wed, Aug 03, 2016 at 08:51:33PM +, Nicola Segata wrote:
> Great, thanks Andreas. We provide the "*.bt2" files so that the user can
> run BowTie2 internally to MetaPhlAn directly without first building the
> indexes (it will take quite a bit of time).

Fully agreed here.

> Also, the indexes are smaller
> in size than the sequence file...

Hmmm, all *.bt2 files sum up to 1,124,449kB while the fasta file has
only 753081kB.  Considering the better compression performance of pure
text files a compressed archive containing the fasta is drastically
smaller than one with the *.bt2 files.  Yesterday I tried to start a
discussion how to deal with the size of the data inside Debian[1] (no
answer so far) and my experiment to create a source tarball just
containing the fasta resulted in a 270MB *xz* compressed file (well, xz
is better than gz, but let's say the compressed tarball with the fasta is
about 30% of the size of your current download of 1,017MB).
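
To make the size comparison above easy to re-check, here is a small
sketch of the arithmetic.  The numbers are the ones quoted in this mail;
reading "1.017MB" as roughly 1017 MB is an assumption, and the variable
names are purely illustrative.

```python
# Sizes quoted in the message above.
BT2_TOTAL_KB = 1_124_449   # sum of all *.bt2 index files, in kB
FASTA_KB = 753_081         # markers.fasta recovered with bowtie2-inspect, in kB

# Assumption: "1.017MB" is read as roughly 1017 MB for the current download.
CURRENT_DOWNLOAD_MB = 1017
FASTA_XZ_MB = 270          # xz-compressed tarball containing only the fasta

# How much larger the uncompressed indexes are than the uncompressed fasta.
index_vs_fasta = BT2_TOTAL_KB / FASTA_KB

# Fraction of the current download taken by the fasta-only xz tarball.
xz_vs_download = FASTA_XZ_MB / CURRENT_DOWNLOAD_MB

print(f"bt2 indexes are {index_vs_fasta:.2f}x the size of the fasta")
print(f"fasta-only xz tarball is {xz_vs_download:.0%} of the current download")
```

Under that reading, the fasta-only tarball comes out a little below the
"about 30%" estimate.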

The situation for Debian is different from that of your users:  A user who
downloads from your website intends to run metaphlan2.  Amongst the
millions of Debian users only very few are interested in metaphlan2 and
we need to weigh how many resources we can spend.  It's not only
Debian that provides resources:  there is a large mirroring network that
spends lots of bandwidth and disk space for very little usage.  So in
this case it makes sense to put the effort on the user's side to
regenerate the indexes (or even download the data separately via a
script we could provide inside the package).  So I could imagine
packaging only the metaphlan2 code and providing a script that downloads
the data and puts it into the expected place.
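
A download helper of the kind described above could be sketched roughly
as follows.  This is only an illustration under stated assumptions:  the
function name fetch_database, the URL and the destination directory are
hypothetical, and no such script exists in the package yet.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fetch_database(url, dest_dir):
    """Download `url` into `dest_dir` unless the file is already present.

    Hypothetical helper: name, arguments and layout are assumptions.
    """
    target_dir = Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last path component of the URL.
    target = target_dir / Path(urlparse(url).path).name
    if target.exists():
        return target  # nothing to do, data was fetched earlier

    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file in the expected place.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

A call might look like fetch_database("https://example.org/db_v20.tar",
"/var/lib/metaphlan2"), where both arguments are placeholders rather than
real locations.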

Kind regards

 Andreas.

[1] 
https://lists.alioth.debian.org/pipermail/debian-med-packaging/2016-August/044984.html

> cheers
> Nicola
> 
> On Wed, Aug 3, 2016 at 6:08 PM Andreas Tille  wrote:
> 
> > Hi Tin,
> >
> > On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong wrote:
> > > > - Tin can also provide more info about the binary data in db_v20. The
> > files
> > > > ending with "bt2" are created using a script in the Bowtie2 package
> > > > (bowtie2-build) using a sequence file Tin can provide (it can also be
> > > > recovered from the bt2 files with bowtie2-inspect if I remember well).
> > > As Nicola said, those files in db_v20 are created with bowtie2-build
> > > using a sequence file and you can recover the sequence file by:
> > >
> > > bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 > metaphlan2/markers.fasta
> > >
> > > If you want to rebuild them, the command is:
> > >
> > > bowtie2-build metaphlan2/markers.fasta metaphlan2/db_v21/mpa_v21_m200
> >
> > I can confirm that I can reproduce the files byte identical from
> > markers.fasta.  Is there any reason to ship the binary form instead of
> > the fasta text file?  Moreover, what is the source of the markers.fasta?
> > Is there any related publication or so?
> >
> > > > For the mpa_v20_m200.pkl Tin can also provide the uncompressed python
> > > > object (or he can provide a couple of lines of code to uncompress it?)
> > > It is python dictionary and can be read as:
> > >
> > > import cPickle as pickle
> > > import bz2
> > > db = pickle.load(bz2.BZ2File('db_v20/mpa_v20_m200.pkl', 'r'))
> > >
> > > You can have more information about them at:
> > >
> > https://bitbucket.org/biobakery/metaphlan2#markdown-header-customizing-the-database
> >
> > OK, that page clarifies the method.  Just a personal remark from the
> > point of view of an outsider of bioinformatics:  I'd regard the creation
> > process of the mpa_v20_m200.pkl file a bit cumbersome.  I'd personally
> > prefer droping some text record somewhere and call a script processing
> > this record rather than writing an own script.
> >
> > > In addition, some files were changed the names:
> > >- metaphlan2_strainer.py -> strainphlan.py
> > >- strainer_src -> strainphlan_src
> > >- strainer_tutorial -> strainphlan_tutorial
> > >
> > > Some source files were updated as well.
> > > Please let me know if you need other information.
> >
> > Just drop me a not once you might release a new version containing these
> > changes.  I think I'll try to release the current version as is since at
> > least the origin of the files is clarified now.  I'm not yet sure whether
> > the size of the data is acceptable or might spoil some limit.  Regarding
> > this I'm wondering whether I create a source tarball including rather
> > markers.fasta and create the bt2 files in the build process.
> >
> > Kind regards
> >
> >Andreas.
> >
> > --
> > http://fam-tille.de
> >

-- 
http://fam-tille.de



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-03 Thread Nicola Segata
Great, thanks Andreas. We provide the "*.bt2" files so that the user can
run BowTie2 internally to MetaPhlAn directly without first building the
indexes (it will take quite a bit of time). Also, the indexes are smaller
in size than the sequence file...

cheers
Nicola

On Wed, Aug 3, 2016 at 6:08 PM Andreas Tille  wrote:

> Hi Tin,
>
> On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong wrote:
> > > - Tin can also provide more info about the binary data in db_v20. The
> files
> > > ending with "bt2" are created using a script in the Bowtie2 package
> > > (bowtie2-build) using a sequence file Tin can provide (it can also be
> > > recovered from the bt2 files with bowtie2-inspect if I remember well).
> > As Nicola said, those files in db_v20 are created with bowtie2-build
> > using a sequence file and you can recover the sequence file by:
> >
> > bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 > metaphlan2/markers.fasta
> >
> > If you want to rebuild them, the command is:
> >
> > bowtie2-build metaphlan2/markers.fasta metaphlan2/db_v21/mpa_v21_m200
>
> I can confirm that I can reproduce the files byte identical from
> markers.fasta.  Is there any reason to ship the binary form instead of
> the fasta text file?  Moreover, what is the source of the markers.fasta?
> Is there any related publication or so?
>
> > > For the mpa_v20_m200.pkl Tin can also provide the uncompressed python
> > > object (or he can provide a couple of lines of code to uncompress it?)
> > It is python dictionary and can be read as:
> >
> > import cPickle as pickle
> > import bz2
> > db = pickle.load(bz2.BZ2File('db_v20/mpa_v20_m200.pkl', 'r'))
> >
> > You can have more information about them at:
> >
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-customizing-the-database
>
> OK, that page clarifies the method.  Just a personal remark from the
> point of view of an outsider of bioinformatics:  I'd regard the creation
> process of the mpa_v20_m200.pkl file a bit cumbersome.  I'd personally
> prefer droping some text record somewhere and call a script processing
> this record rather than writing an own script.
>
> > In addition, some files were changed the names:
> >- metaphlan2_strainer.py -> strainphlan.py
> >- strainer_src -> strainphlan_src
> >- strainer_tutorial -> strainphlan_tutorial
> >
> > Some source files were updated as well.
> > Please let me know if you need other information.
>
> Just drop me a not once you might release a new version containing these
> changes.  I think I'll try to release the current version as is since at
> least the origin of the files is clarified now.  I'm not yet sure whether
> the size of the data is acceptable or might spoil some limit.  Regarding
> this I'm wondering whether I create a source tarball including rather
> markers.fasta and create the bt2 files in the build process.
>
> Kind regards
>
>Andreas.
>
> --
> http://fam-tille.de
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-03 Thread Andreas Tille
Hi Tin,

On Wed, Aug 03, 2016 at 02:01:01PM +, Duy Tin Truong wrote:
> > - Tin can also provide more info about the binary data in db_v20. The files
> > ending with "bt2" are created using a script in the Bowtie2 package
> > (bowtie2-build) using a sequence file Tin can provide (it can also be
> > recovered from the bt2 files with bowtie2-inspect if I remember well).
> As Nicola said, those files in db_v20 are created with bowtie2-build
> using a sequence file and you can recover the sequence file by:
> 
> bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 > metaphlan2/markers.fasta
> 
> If you want to rebuild them, the command is:
> 
> bowtie2-build metaphlan2/markers.fasta metaphlan2/db_v21/mpa_v21_m200

I can confirm that I can reproduce the files byte identical from
markers.fasta.  Is there any reason to ship the binary form instead of
the fasta text file?  Moreover, what is the source of the markers.fasta?
Is there any related publication or so?

> > For the mpa_v20_m200.pkl Tin can also provide the uncompressed python
> > object (or he can provide a couple of lines of code to uncompress it?)
> It is python dictionary and can be read as:
> 
> import cPickle as pickle
> import bz2
> db = pickle.load(bz2.BZ2File('db_v20/mpa_v20_m200.pkl', 'r'))
>
> You can have more information about them at:
> https://bitbucket.org/biobakery/metaphlan2#markdown-header-customizing-the-database

OK, that page clarifies the method.  Just a personal remark from the
point of view of an outsider to bioinformatics:  I'd regard the creation
process of the mpa_v20_m200.pkl file as a bit cumbersome.  I'd personally
prefer dropping some text record somewhere and calling a script that
processes this record rather than writing my own script.
 
> In addition, some files were changed the names:
>- metaphlan2_strainer.py -> strainphlan.py
>- strainer_src -> strainphlan_src
>- strainer_tutorial -> strainphlan_tutorial
> 
> Some source files were updated as well.
> Please let me know if you need other information.

Just drop me a note once you release a new version containing these
changes.  I think I'll try to release the current version as is since at
least the origin of the files is clarified now.  I'm not yet sure whether
the size of the data is acceptable or might exceed some limit.  Regarding
this I'm wondering whether I should create a source tarball that includes
markers.fasta instead and create the bt2 files in the build process.
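
Such a build step could be sketched roughly as below.  It only wraps the
two Bowtie2 commands quoted above (bowtie2-inspect writing FASTA to
stdout, bowtie2-build taking a FASTA file and an index base name);  the
Python function names are hypothetical and this is not part of any
existing packaging.

```python
import subprocess
from pathlib import Path
from typing import List


def inspect_command(index_base: str) -> List[str]:
    """Command that prints the indexed sequences as FASTA on stdout."""
    return ["bowtie2-inspect", index_base]


def build_command(fasta: str, index_base: str) -> List[str]:
    """Command that builds the *.bt2 index files from a FASTA file."""
    return ["bowtie2-build", fasta, index_base]


def extract_fasta(index_base: str, fasta: str) -> None:
    """Recover the sequence file from an existing index."""
    with open(fasta, "wb") as out:
        subprocess.run(inspect_command(index_base), stdout=out, check=True)


def rebuild_index(fasta: str, index_base: str) -> None:
    """Regenerate the index, creating the target directory if needed."""
    Path(index_base).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_command(fasta, index_base), check=True)
```

Both helpers raise an exception if the Bowtie2 tool exits with an error,
so a failed rebuild would stop the package build instead of shipping
incomplete index files.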

Kind regards

   Andreas. 

-- 
http://fam-tille.de



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-03 Thread Duy Tin Truong
Hi Andreas,

Thanks for your work. My answers to your questions are below:

- some small fixes:
https://anonscm.debian.org/viewvc/debian-med/trunk/packages/metaphlan2/trunk/debian/patches/fix_sequence.patch?view=markup

-> fixed
- some spelling issues:
https://anonscm.debian.org/viewvc/debian-med/trunk/packages/metaphlan2/trunk/debian/patches/spelling.patch?view=markup

- Tin can also provide more info about the binary data in db_v20. The files
ending with "bt2" are created using a script in the Bowtie2 package
(bowtie2-build) using a sequence file Tin can provide (it can also be
recovered from the bt2 files with bowtie2-inspect if I remember well).
  As Nicola said, those files in db_v20 are created with bowtie2-build
using a sequence file and you can recover the sequence file by:

bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 > metaphlan2/markers.fasta

If you want to rebuild them, the command is:

bowtie2-build metaphlan2/markers.fasta metaphlan2/db_v21/mpa_v21_m200


- For the mpa_v20_m200.pkl Tin can also provide the uncompressed python
object (or he can provide a couple of lines of code to uncompress it?)
   It is a Python dictionary and can be read as:

import cPickle as pickle
import bz2
db = pickle.load(bz2.BZ2File('db_v20/mpa_v20_m200.pkl', 'r'))
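
The lines above are for Python 2, where cPickle exists.  For Python 3, a
minimal sketch of the same read might look like the following.  The
helper name load_bz2_pickle is hypothetical, and encoding="latin1" is an
assumption that is often needed when a pickle written by Python 2
contains byte strings; it has not been checked against this file.

```python
import bz2
import pickle


def load_bz2_pickle(path, encoding="latin1"):
    """Read a bz2-compressed pickle and return the stored object."""
    # bz2.open decompresses transparently while pickle reads from it.
    with bz2.open(path, "rb") as handle:
        return pickle.load(handle, encoding=encoding)
```

Usage would then be db = load_bz2_pickle('db_v20/mpa_v20_m200.pkl'),
after which sorted(db.keys()) lists the top-level entries of the
dictionary.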


You can find more information about them at:
https://bitbucket.org/biobakery/metaphlan2#markdown-header-customizing-the-database

In addition, some files were renamed:
   - metaphlan2_strainer.py -> strainphlan.py
   - strainer_src -> strainphlan_src
   - strainer_tutorial -> strainphlan_tutorial

Some source files were updated as well.
Please let me know if you need other information.

Thanks,
Tin

On Wed, Aug 3, 2016 at 3:38 PM Andreas Tille  wrote:

> Hi Nicola,
>
> thanks for your answer.
>
> On Tue, Aug 02, 2016 at 04:32:31PM +, Nicola Segata wrote:
> > Hi Andreas,
> >  sorry for the delay in replying. I did get your last two emails but it
> > seems the fist one (On Mon, Jul 25, 2016 at 09:45:57PM) never arrived.
>
> Hmmm, sad that there seems to be some mail loss.
>
> > Tin can also provide more info about the binary data in db_v20. The files
> > ending with "bt2" are created using a script in the Bowtie2 package
> > (bowtie2-build) using a sequence file Tin can provide (it can also be
> > recovered from the bt2 files with bowtie2-inspect if I remember well).
> >
> > For the mpa_v20_m200.pkl Tin can also provide the uncompressed python
> > object (or he can provide a couple of lines of code to uncompress it?)
>
> Anything that qualifies as source would be really welcome.  If the
> generation of the binary from this source does not make a big effort (in
> terms of "takes way longer than 1 hour on a decent build machine")
> generating the binaries would be really prefered.
>
> > For the LEfSe package I just added the license in the bitbucket
> repository.
> > For the description, I think you can use the following page:
> > https://bitbucket.org/biobakery/biobakery/wiki/lefse
> > Does it sound like an appropriate description for the package?
>
> I found this after I've sent my mails - thanks for confirming that this
> is the correct description.  I've just uploaded the package to the
> Debian new queue.
>
> > Let me know if you have other questions or if I missed answering to other
> > emails.
>
> If Tin will answer the binary data issue above I have no further
> questions and do not remember any unanswered e-mails.
>
> > thanks so much for your work!
>
> You are welcome
>
>   Andreas.
>
> --
> http://fam-tille.de
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-03 Thread Andreas Tille
Hi Nicola,

thanks for your answer.

On Tue, Aug 02, 2016 at 04:32:31PM +, Nicola Segata wrote:
> Hi Andreas,
>  sorry for the delay in replying. I did get your last two emails but it
> seems the fist one (On Mon, Jul 25, 2016 at 09:45:57PM) never arrived.

Hmmm, sad that there seems to be some mail loss.
 
> Tin can also provide more info about the binary data in db_v20. The files
> ending with "bt2" are created using a script in the Bowtie2 package
> (bowtie2-build) using a sequence file Tin can provide (it can also be
> recovered from the bt2 files with bowtie2-inspect if I remember well).
> 
> For the mpa_v20_m200.pkl Tin can also provide the uncompressed python
> object (or he can provide a couple of lines of code to uncompress it?)

Anything that qualifies as source would be really welcome.  If generating
the binary from this source is not a big effort (that is, it does not
take way longer than 1 hour on a decent build machine), generating the
binaries would really be preferred.
 
> For the LEfSe package I just added the license in the bitbucket repository.
> For the description, I think you can use the following page:
> https://bitbucket.org/biobakery/biobakery/wiki/lefse
> Does it sound like an appropriate description for the package?

I found this after I've sent my mails - thanks for confirming that this
is the correct description.  I've just uploaded the package to the
Debian new queue.
 
> Let me know if you have other questions or if I missed answering to other
> emails.

If Tin answers the binary data question above I have no further
questions and do not remember any unanswered e-mails.
 
> thanks so much for your work!

You are welcome

  Andreas.

-- 
http://fam-tille.de



Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-08-02 Thread Nicola Segata
Hi Andreas,
 sorry for the delay in replying. I did get your last two emails but it
seems the first one (On Mon, Jul 25, 2016 at 09:45:57PM) never arrived.

Thanks so much for the suggestions. Tin, can you take a look at these two
sets of suggestions and update the source code of MetaPhlAn2 accordingly?
some small fixes:
https://anonscm.debian.org/viewvc/debian-med/trunk/packages/metaphlan2/trunk/debian/patches/fix_sequence.patch?view=markup
some spelling issues
https://anonscm.debian.org/viewvc/debian-med/trunk/packages/metaphlan2/trunk/debian/patches/spelling.patch?view=markup

Tin can also provide more info about the binary data in db_v20. The files
ending with "bt2" are created using a script in the Bowtie2 package
(bowtie2-build) using a sequence file Tin can provide (it can also be
recovered from the bt2 files with bowtie2-inspect if I remember well).

For the mpa_v20_m200.pkl Tin can also provide the uncompressed python
object (or he can provide a couple of lines of code to uncompress it?)

For the LEfSe package I just added the license in the bitbucket repository.
For the description, I think you can use the following page:
https://bitbucket.org/biobakery/biobakery/wiki/lefse
Does it sound like an appropriate description for the package?

Let me know if you have other questions or if I missed answering other
emails.

thanks so much for your work!

Nicola


On Sat, Jul 30, 2016 at 10:47 PM Andreas Tille  wrote:

> Hi Nicola,
>
> did you received the two mails about licensing (+ in the case of lefse
> a description) ?
>
> Kind regards
>
>Andreas.
>
> On Wed, Jul 27, 2016 at 09:09:01AM +0200, Andreas Tille wrote:
> > Hi again,
> >
> > I have another question in addition to the one below.  I was packaging
> > MetaPhLan2 to make use of it in my final target package for metaBIT[1]
> > which in addition is using some Python code I've found here
> >
> >https://bitbucket.org/nsegata/lefse
> >
> > This code has neither any description I could use for the package nor a
> > license.  It would be really great if you could provide a basic
> > description and for packaging a free license would be needed.
> >
> > Kind regards
> >
> >Andreas.
> >
> > On Mon, Jul 25, 2016 at 09:45:57PM +0200, Andreas Tille wrote:
> > > Hi Nicola,
> > >
> > > we just had some conversation about pyphlan.  I now want to package
> > > MetaPhLan2 for Debian and I have prepared the needed packaging stuff.
> > > I did some small fixes
> > >
> > >
> https://anonscm.debian.org/viewvc/debian-med/trunk/packages/metaphlan2/trunk/debian/patches/fix_sequence.patch?view=markup
> > >
> > > and fixed some spelling issues
> > >
> > >
> https://anonscm.debian.org/viewvc/debian-med/trunk/packages/metaphlan2/trunk/debian/patches/spelling.patch?view=markup
> > >
> > > you might possibly want to take over in your upstream source.
> > >
> > > For the final upload to Debian I would need some information how the
> > > binary data files in db_v20 were created.  Debian requires somehow
> > > editable source for each file.  This is probably not possible for the
> > > files in question but we need to provide some information where the
> > > files are obtained from.
> > >
> > > Kind regards
> > >
> > >Andreas.
> >
> > [1] https://bitbucket.org/Glouvel/metabit/wiki/
> >
> > --
> > http://fam-tille.de
>
> --
> http://fam-tille.de
>


Re: Description for lefse tools (Was: Origin of data files in MetaPhLan2)

2016-07-30 Thread Andreas Tille
Hi Nicola,

did you receive the two mails about licensing (+ in the case of lefse
a description)?

Kind regards

   Andreas.

On Wed, Jul 27, 2016 at 09:09:01AM +0200, Andreas Tille wrote:
> Hi again,
> 
> I have another question in addition to the one below.  I was packaging
> MetaPhLan2 to make use of it in my final target package for metaBIT[1]
> which in addition is using some Python code I've found here
> 
>https://bitbucket.org/nsegata/lefse
> 
> This code has neither any description I could use for the package nor a
> license.  It would be really great if you could provide a basic
> description and for packaging a free license would be needed.
> 
> Kind regards
> 
>Andreas.
> 
> On Mon, Jul 25, 2016 at 09:45:57PM +0200, Andreas Tille wrote:
> > Hi Nicola,
> > 
> > we just had some conversation about pyphlan.  I now want to package
> > MetaPhLan2 for Debian and I have prepared the needed packaging stuff.
> > I did some small fixes
> > 
> >
> > https://anonscm.debian.org/viewvc/debian-med/trunk/packages/metaphlan2/trunk/debian/patches/fix_sequence.patch?view=markup
> > 
> > and fixed some spelling issues
> > 
> >
> > https://anonscm.debian.org/viewvc/debian-med/trunk/packages/metaphlan2/trunk/debian/patches/spelling.patch?view=markup
> > 
> > you might possibly want to take over in your upstream source.
> > 
> > For the final upload to Debian I would need some information how the
> > binary data files in db_v20 were created.  Debian requires somehow
> > editable source for each file.  This is probably not possible for the
> > files in question but we need to provide some information where the
> > files are obtained from.
> > 
> > Kind regards
> > 
> >Andreas.
> 
> [1] https://bitbucket.org/Glouvel/metabit/wiki/
> 
> -- 
> http://fam-tille.de

-- 
http://fam-tille.de