Re: [Pharo-users] Conditional CSV parsing

Sven Van Caekenberghe Mon, 26 Jan 2015 04:01:54 -0800

Hernán,

> On 26 Jan 2015, at 08:00, Hernán Morales Durand <hernan.mora...@gmail.com> 
> wrote:
> 
> It is possible :)
> I work with DNA sequences, there could be millions of common SNPs in a genome.


Still weird for CSV. How many records are there then ? I assume they all have 
the same number of fields ?

Anyway, could you point me to the specification of the format you want to read ?
And to the older parser that you used to use ?

Thx,

Sven

> Cheers,
> 
> Hernán
> 
> 
> 2015-01-26 3:33 GMT-03:00 Sven Van Caekenberghe <s...@stfx.eu>:
> 
> > On 26 Jan 2015, at 06:32, Hernán Morales Durand <hernan.mora...@gmail.com> 
> > wrote:
> >
> >
> >
> > 2015-01-23 18:00 GMT-03:00 Sven Van Caekenberghe <s...@stfx.eu>:
> >
> > > On 23 Jan 2015, at 20:53, Hernán Morales Durand 
> > > <hernan.mora...@gmail.com> wrote:
> > >
> > > Hi Sven,
> > >
> > > 2015-01-23 16:06 GMT-03:00 Sven Van Caekenberghe <s...@stfx.eu>:
> > > Hi Hernán,
> > >
> > > > On 23 Jan 2015, at 19:50, Hernán Morales Durand 
> > > > <hernan.mora...@gmail.com> wrote:
> > > >
> > > > I used to use a CSV parser from Squeak where I could attach conditional 
> > > > iterations:
> > > >
> > > > csvParser rowsSkipFirst: 2 do: [: row | " some action ignoring first 2 
> > > > fields on each row " ].
> > > > csvParser rowsSkipLast: 2 do: [: row | " some action ignoring last 2 
> > > > fields on each row " ].
> > >
> > > With NeoCSVReader you can describe how each field is read and converted, 
> > > using the same mechanism you can ignore fields. Have a look at the 
> > > senders of #addIgnoredField from the unit tests.
> > >
> > >
> > > I am trying to understand the implementation, I see you included 
> > > #addIgnoredFields: for consecutive fields in 
> > > Neo-CSV-Core-SvenVanCaekenberghe.21
> > > A question about usage then, adding ignored field(s) requires adding 
> > > field types on all other remaining fields?
> >
> > Yes, like this:
> >
> > testReadWithIgnoredField
> >         | input |
> >         input := (String crlf join: #( '1,2,a,3' '1,2,b,3' '1,2,c,3' '')).
> >         self
> >                 assert: ((NeoCSVReader on: input readStream)
> >                                         addIntegerField;
> >                                         addIntegerField;
> >                                         addIgnoredField;
> >                                         addIntegerField;
> >                                         upToEnd)
> >                 equals: {
> >                         #(1 2 3).
> >                         #(1 2 3).
> >                         #(1 2 3).}
> >
> >
> >
> > In case you make another pass over NeoCSV, you may like to know that some 
> > of my data sets have 1 million columns; an addFieldsInterval: or similar 
> > would be nice.
> 
> 1 million columns ? How is that possible, useful ?
> 
> The reader is like a builder. You could try to do this yourself by writing a 
> little loop or two.
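> 
> A minimal sketch of such a loop, assuming a made-up six-column input and the 
> #addIgnoredFields: message mentioned earlier in this thread (the column 
> counts and sample data are illustrative only, not taken from a real data set):
> 
> ```smalltalk
> | input reader |
> "Hypothetical input: two leading columns and one trailing column to skip."
> input := String crlf join: #( 'x,y,1,2,3,z' 'x,y,4,5,6,z' '' ).
> reader := NeoCSVReader on: input readStream.
> "Skip the first two fields of every record."
> reader addIgnoredFields: 2.
> "Declare the middle fields in a loop instead of writing each one out."
> 3 timesRepeat: [ reader addIntegerField ].
> "Skip the last field."
> reader addIgnoredField.
> reader upToEnd
> ```
> 
> By analogy with testReadWithIgnoredField, each record read this way should 
> hold only the three integer fields. For a very wide file the literal 3 would 
> be replaced by the real column count.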
> 
> But still, 1 million ?
> 
> > Thank you.
> >
> > Hernán
> >
> 
> 
>
