Ayache and Chandrashekhar,

You are right; I am also reluctant to go for the JSON transformation. Storing the XML in HBase without transforming it to JSON would be a lot easier at the storage stage.
But my concern is querying this XML data from HBase. The queries include aggregations, counts and joins, to name a few. Can you please shed some light on how to query XML data in HBase? Is it possible to use XQuery or XPath?

The JSON transformation was considered because of MongoDB, as it supports JSON natively and seems to be good at analytics. Analytics would come at a later stage.

Can you please share some insights into querying XML from HBase? Any links or examples would be helpful; I am unable to find any.

Thanks in advance
Shashi

On Sun, Jan 4, 2015 at 4:18 PM, Ayache Khettar <ayache.khet...@googlemail.com> wrote:

> Hi
>
> You can store XML in HBase without any issue. It all depends on what you
> do with the XML. To query the XML back, you will have to store its
> metadata with it under the same row ID. This way you can query the XML
> back. I would go for the JSON transformation only if the downstream flow
> needs the payload in JSON format.
>
> Ayache
>
> On 4 January 2015 at 10:07, Chandrashekhar Kotekar <shekhar.kote...@gmail.com> wrote:
>
> > You can convert the XML to JSON using a MapReduce program and then store
> > the JSON in HBase, but you need to decide what your row key should be.
> >
> > Another point to take into account is whether or not you want to search
> > inside the JSON. If you do, HBase won't be the best option for you. You
> > could probably switch to MongoDB or some other document store.
> >
> > Hope it helps...
> >
> > Regards,
> > Chandrashekhar
> >
> > On 04-Jan-2015 3:32 PM, "Shashidhar Rao" <raoshashidhar...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Can someone tell me whether the solution I am proposing is feasible
> > > or not?
> > >
> > > 1. Large XML data is delivered by an external system.
> > > 2. Convert it into JSON format.
> > > 3. Store it in HBase, even though there will be hardly any updates,
> > >    only retrieval. I looked at Hive but finally decided against it,
> > >    as retrieval would be slow.
> > > 4. We need to use a Hadoop NoSQL store, as the other components all
> > >    use the Hadoop ecosystem.
> > >
> > > Can XML data be stored directly in HBase without any transformation?
> > > (second question)
> > >
> > > Any suggestions on storing XML data in NoSQL? (open source only, no
> > > commercial NoSQL)
> > >
> > > Thanks in advance
> > >
> > > Shashi
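
For illustration, here is a rough sketch of the idea quoted above from Ayache: keep the raw XML and its extracted metadata together under the same row ID, then run counts and aggregations over the metadata columns rather than over the XML itself. This is only a sketch under stated assumptions. The table is a plain Python dict standing in for an HBase table, and the element names, row keys and the "raw:" / "meta:" column names are all made up for the example. No HBase client API is used, and the XPath-style lookups come from Python's standard xml.etree.ElementTree, not from HBase.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample documents; the element names are illustrative only.
DOCS = {
    "order-001": "<order><customer>acme</customer><amount>120.50</amount></order>",
    "order-002": "<order><customer>acme</customer><amount>79.50</amount></order>",
    "order-003": "<order><customer>globex</customer><amount>10.00</amount></order>",
}


def to_row(xml_text):
    """Build one row: the raw XML plus metadata columns extracted at write time."""
    root = ET.fromstring(xml_text)
    return {
        "raw:xml": xml_text,  # untouched payload
        "meta:customer": root.findtext("./customer"),  # XPath-style lookup
        "meta:amount": root.findtext("./amount"),
    }


# Stand-in for an HBase table: row key -> {"family:qualifier": value}.
table = {row_key: to_row(xml_text) for row_key, xml_text in DOCS.items()}


def totals_by_customer(rows):
    """Aggregate over the metadata columns only; the XML is never re-parsed."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows.values():
        customer = row["meta:customer"]
        totals[customer] += float(row["meta:amount"])
        counts[customer] += 1
    return dict(totals), dict(counts)


totals, counts = totals_by_customer(table)
```

The point of the design is that the fields needed for queries are pulled out once, when the document is written, so later scans only touch small metadata columns; the full XML is fetched by row key only when the payload itself is needed.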