On Wed, Jun 18, 2014 at 12:03 PM, Dmitriy Lyubimov <[email protected]> wrote:
> > How important are the String row keys for the algorithms itself? Would it
> > grossly mess up a workflow if Strings are silently discarded by the
> > backend?
> >
>
> like i said, seq2sparse produces them, and postprocessing for stuff like
> LSA pipelines would not work.

Something as coarse as translating to a dictionary index would probably work. Creating the dictionary in parallel while reading the data should be quite doable.
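
To make the dictionary idea a bit more concrete, here is a rough sketch. It is plain Python for illustration only, not Mahout code: the names `collect_keys`, `build_dictionary`, and `reindex` are made up, the partitions are toy data, and a thread pool merely stands in for whatever parallel read the backend would actually do. Each partition reports its distinct String keys, the key sets are merged into one key-to-index dictionary, and the reverse mapping is kept so that postprocessing can still recover the original keys.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_keys(partition):
    """Return the distinct string row keys seen in one partition."""
    return {key for key, _ in partition}


def build_dictionary(partitions):
    """Scan partitions concurrently, then merge into a key -> int index map."""
    with ThreadPoolExecutor() as pool:
        key_sets = list(pool.map(collect_keys, partitions))
    # Sorting makes the assigned indices deterministic across runs.
    all_keys = sorted(set().union(*key_sets))
    return {key: idx for idx, key in enumerate(all_keys)}


def reindex(partitions, dictionary):
    """Replace each string row key with its integer index."""
    return [[(dictionary[key], vec) for key, vec in part] for part in partitions]


# Toy input: two partitions of (string row key, row vector) pairs.
partitions = [
    [("doc-b", [1.0, 0.0]), ("doc-a", [0.0, 2.0])],
    [("doc-c", [3.0, 1.0])],
]

dictionary = build_dictionary(partitions)
int_keyed = reindex(partitions, dictionary)

# Reverse mapping, so later stages can translate indices back to strings.
reverse = {idx: key for key, idx in dictionary.items()}
```

The backend would then only ever see the integer-keyed rows, and the dictionary travels alongside as a side artifact instead of the Strings being silently dropped.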
