It's this kind of thing that forced me to move to sequence files instead of
TextKeyValueInput format and other text-based/CSV-based formats. I kind of
regret the decision to go with a tab-separated format for BayesClassifier,
which I wrote 2 years ago. I will be modifying it to use sparse vectors
or sequence files, whichever fits.

My thought is that this kind of functionality should only be used by the
format converters that convert to and from sequence files. And when
storing to sequence files, just enforce the \n rule for line breaks.
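
As a rough sketch of what enforcing '\n' could look like in a converter:
FixedLineBreaks is a made-up name, not an existing Mahout class, and plain
JDK writers stand in here for whatever the converter actually writes to.
The writer never consults System.getProperty("line.separator"), while the
reader stays lenient because BufferedReader.readLine() accepts \n, \r and
\r\n.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Mahout): fixed '\n' line breaks on output. */
final class FixedLineBreaks {

  /** Always '\n', never the platform-specific line.separator property. */
  private static final char LINE_BREAK = '\n';

  private FixedLineBreaks() {}

  /** Writes each record followed by '\n', regardless of the host platform. */
  static void writeLines(Writer out, Iterable<String> records) throws IOException {
    for (String record : records) {
      out.write(record);
      out.write(LINE_BREAK);
    }
  }

  /** Reads lines leniently: readLine() treats \n, \r and \r\n as terminators. */
  static List<String> readLines(String text) throws IOException {
    List<String> lines = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
      String line;
      while ((line = reader.readLine()) != null) {
        lines.add(line);
      }
    }
    return lines;
  }
}
```

With this, output written on Windows and on Unix is byte-for-byte the same,
and input that already contains \r\n still splits into the same records.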

Robin



On Mon, Jan 18, 2010 at 5:34 PM, Sean Owen <sro...@gmail.com> wrote:

> As I troll through the code at times trying to polish here and there I
> notice small issues to bring up --
>
> Line separators. Lots of code independently reads
> System.getProperty("line.separator") in order to output a platform
> specific line break. I argue this is actually slightly bad, since it
> means the input/output formats of Mahout aren't fixed at all, but can
> vary by platform. Output on Windows isn't read properly by Unix, etc.,
> perhaps.
>
> It'd be simpler and more compatible to use '\n' always. Thoughts?
>
> (And, recall we don't really support Windows so well anyway, which is
> the odd man out in this regard.)
>
