> Then use hbase

+1
On Fri, Sep 4, 2015 at 9:00 AM, Jörn Franke <jornfra...@gmail.com> wrote:

> Then use HBase or similar. You originally wrote it was just for storing.
>
> On Fri, Sep 4, 2015 at 4:30 PM, Tao Lu <taolu2...@gmail.com> wrote:
>
>> Basically they need NoSQL-like random update access.
>>
>> On Fri, Sep 4, 2015 at 9:56 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> What about concurrent access (read / update) to the small file with the
>>> same key?
>>>
>>> That can get a bit tricky.
>>>
>>> On Thu, Sep 3, 2015 at 2:47 PM, Jörn Franke <jornfra...@gmail.com>
>>> wrote:
>>>
>>>> Well, it is the same as in normal HDFS: deleting the file and putting a
>>>> new one with the same name works.
>>>>
>>>> On Thu, Sep 3, 2015 at 9:18 PM, <nib...@free.fr> wrote:
>>>>
>>>>> A HAR archive seems a good idea, but just one last question to be sure
>>>>> I make the best choice:
>>>>> - Is it possible to overwrite (remove/replace) a file inside the HAR?
>>>>> Basically the names of my small files will be the keys of my records,
>>>>> and sometimes I will need to replace the content of a file with new
>>>>> content (remove/replace).
>>>>>
>>>>> Thanks a lot
>>>>> Nicolas
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Jörn Franke" <jornfra...@gmail.com>
>>>>> To: nib...@free.fr
>>>>> Cc: user@spark.apache.org
>>>>> Sent: Thursday, September 3, 2015 19:29:42
>>>>> Subject: Re: Small File to HDFS
>>>>>
>>>>> HAR is transparent and has hardly any performance overhead. You may
>>>>> decide not to compress, or to use a fast compression algorithm such as
>>>>> Snappy (recommended).
>>>>>
>>>>> On Thu, Sep 3, 2015 at 4:17 PM, < nib...@free.fr > wrote:
>>>>>
>>>>> My main question about using HAR is: is it possible to use Pig on it,
>>>>> and what about performance?
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Jörn Franke" < jornfra...@gmail.com >
>>>>> To: nib...@free.fr , user@spark.apache.org
>>>>> Sent: Thursday, September 3, 2015 15:54:42
>>>>> Subject: Re: Small File to HDFS
>>>>>
>>>>> Store them as a Hadoop archive (HAR).
>>>>>
>>>>> On Wed, Sep 2, 2015 at 6:07 PM, < nib...@free.fr > wrote:
>>>>>
>>>>> Hello,
>>>>> I am currently using Spark Streaming to collect small messages
>>>>> (events). Each is under 50 KB, the volume is high (several million
>>>>> per day), and I have to store those messages in HDFS.
>>>>> I understand that storing small files can be problematic in HDFS. How
>>>>> can I manage it?
>>>>>
>>>>> Thanks
>>>>> Nicolas
>>>>>
>>
>> --
>> Thanks!
>> Tao
>>
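
As a rough back-of-the-envelope sketch of why the original question matters: each file and each block in HDFS takes up NameNode heap, so one file per message adds up quickly. The numbers below are assumptions, not figures from this thread: roughly 150 bytes of heap per namespace object is a commonly cited rule of thumb, 3 million messages per day is only an illustrative reading of "several millions per day", and the function name `namenode_heap_bytes` is hypothetical.

```python
# Rough estimate of NameNode heap consumed by storing one HDFS file per message.
# Assumptions (not from the thread): ~150 bytes of heap per namespace object,
# and each small file costs two objects (one file entry plus one block).

BYTES_PER_OBJECT = 150      # rule-of-thumb figure, an assumption
OBJECTS_PER_SMALL_FILE = 2  # one file entry plus one block, an assumption


def namenode_heap_bytes(messages_per_day: int, days: int) -> int:
    """Return the estimated NameNode heap (bytes) for one file per message."""
    files = messages_per_day * days
    return files * OBJECTS_PER_SMALL_FILE * BYTES_PER_OBJECT


if __name__ == "__main__":
    # 3 million/day is an illustrative stand-in for "several millions per day".
    estimate = namenode_heap_bytes(3_000_000, 365)
    print(f"{estimate / 2**30:.1f} GiB of NameNode heap after one year")
```

Packing the same messages into far fewer, larger files (a HAR, or a store such as HBase, as suggested earlier in the thread) shrinks the object count, and with it this estimate, proportionally.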