Thanks for your answer.

Yes, I guess I will try to use these classes directly to access my data.

Best regards,

On 29/12/2013 03:22, Cheolsoo Park wrote:
I haven't done it myself, so I can't give you a detailed answer. But every
storage function is associated with an InputFormat/OutputFormat as well as
a RecordReader/RecordWriter.

As for BinStorage, you can take a look at BinStorageRecordReader:
https://github.com/apache/pig/blob/trunk/src/org/apache/pig/impl/io/BinStorageRecordReader.java#L40


On Thu, Dec 26, 2013 at 3:35 AM, Vincent Barat <vincent.ba...@gmail.com> wrote:

Hi all, and merry Christmas!

I generate a file using a Pig script embedded in a Java process and store
it using BinStorage.

Then I would like to read this file directly from another Java client,
without starting a Pig script (i.e., using only the Hadoop API and Pig's
BinStorage class).
The goal is to do some real-time computation by scanning the file, so I
cannot afford to start a Pig script for the computation: the overhead of
starting the script and getting the result is too long for my real-time
objectives (I need a result within a few seconds).

Of course, I could use JsonStorage and read my file with a JSON
deserializer, but my guess is that it would be much slower, and it would
also be painful to handle the various part files generated for the output
(part-r-XXXXX).
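
On the part files specifically, here is a minimal sketch of collecting them
in a stable order before reading each one. It assumes the job output
directory is reachable as a local path; on HDFS the same listing would go
through Hadoop's FileSystem API instead. The class name PartFiles and the
method listPartFiles are hypothetical names used only for illustration, and
the sketch does not decode the BinStorage records themselves.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PartFiles {
    // Returns the part-* files of a job output directory, sorted by name.
    // Entries such as _SUCCESS or _logs do not match the glob and are skipped.
    public static List<Path> listPartFiles(Path outputDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    parts.add(p);
                }
            }
        }
        // Directory listing order is unspecified, so sort explicitly.
        Collections.sort(parts);
        return parts;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the output directory is passed as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : listPartFiles(dir)) {
            System.out.println(p + " (" + Files.size(p) + " bytes)");
        }
    }
}
```

Each returned path could then be handed to whatever reader ends up decoding
the records, one part at a time.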

Best regards,


    

--
Vincent BARAT
CTO, Capptain
p. +33 299 656 913
m. +33 615 411 518
a. 18 rue Tronchet, 75008 Paris, France
