Ayache and Chandrashekhar,

You are correct; I am also reluctant to go for a JSON transformation. Storing
XML in HBase without transforming it to JSON would be a lot easier at the
storage stage.

But my concern is querying this XML data from HBase. The queries include
aggregations, counts and joins, to name a few. Can you please shed some
light on how to query XML data from HBase? Is it possible to use XQuery
or XPath?
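
To make the question concrete, here is a rough sketch of the kind of query I
have in mind. It is only an illustration under my own assumptions: each row
holds one XML document as a string in a single cell, a plain Python dict stands
in for the HBase table (no real HBase client is used), and the element names
(order, customer, item, price) are made up. The XPath is evaluated on the
client side after the rows are fetched, not inside HBase.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an HBase table: row key -> XML string in one cell.
# A real client would fetch these values with a scan or get instead.
rows = {
    "order-1": "<order><customer>alice</customer>"
               "<item price='10.0'/><item price='5.5'/></order>",
    "order-2": "<order><customer>bob</customer>"
               "<item price='3.0'/></order>",
}


def item_count_and_total(xml_text):
    """Parse one XML document and return (number of items, sum of prices)."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a limited XPath subset; "./item" selects the
    # direct <item> children of the root element.
    items = root.findall("./item")
    return len(items), sum(float(item.get("price")) for item in items)


def aggregate(table):
    """Client-side count and sum over every row in the (stand-in) table."""
    count = 0
    total = 0.0
    for xml_text in table.values():
        n, t = item_count_and_total(xml_text)
        count += n
        total += t
    return count, total


if __name__ == "__main__":
    print(aggregate(rows))
```

This works for small data, but it reads and parses every row for each query,
which is exactly why I am asking whether there is a better way.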

The JSON transformation was considered because of MongoDB, as it supports
JSON natively and seems to be good for analytics. Analytics would come at
a later stage.

Can you please share some insights into querying XML from HBase? Any links
or examples would be helpful, as I am unable to find any.

Thanks in advance

Shashi

On Sun, Jan 4, 2015 at 4:18 PM, Ayache Khettar <
ayache.khet...@googlemail.com> wrote:

> Hi
>
> You can perfectly well store XML in HBase without any issue. It all depends
> on what you do with the XML. To query the XML back, you will have to store
> its metadata with it under the same row ID. This way you can query the XML
> back. I would go for JSON transformation only if the downstream flow needs
> the payload in JSON format.
>
> Ayache
> On 4 January 2015 at 10:07, Chandrashekhar Kotekar <
> shekhar.kote...@gmail.com> wrote:
>
> > You can convert XML to JSON using a MapReduce program and then store the
> > JSON in HBase, but you need to decide what your row key should be.
> >
> > Another point to take into account is whether or not you want to search
> > inside the JSON. If you do, then HBase won't be the best option for you.
> > You could probably switch to MongoDB or some other document store.
> >
> > Hope it helps...
> >
> > Regards,
> > Chandrashekhar
> > On 04-Jan-2015 3:32 PM, "Shashidhar Rao" <raoshashidhar...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Can someone guide me if the solution I am proposing is a feasible option
> > > or not?
> > >
> > > 1. Large xml data is delivered through external system.
> > > 2. Convert these into json format.
> > > 3. Store it in HBase, even though there will be hardly any updates, only
> > > retrieval. I have looked at Hive but finally had to decide against it, as
> > > retrieval would be slow.
> > > 4. Need to use Hadoop Nosql as other components are all using Hadoop
> > > ecosystem.
> > >
> > > Can XML data be stored directly in HBase without any
> > > transformation? (second question)
> > >
> > > Any suggestions on storing XML data in NoSQL? (open source only, no
> > > commercial NoSQL)
> > >
> > > Thanks in advance
> > >
> > > Shashi
> > >
> >
>
