Thanks a lot, Ayache, for the links.

On Sun, Jan 4, 2015 at 8:49 PM, Ayache Khettar <ayache.khet...@googlemail.com> wrote:
> Hi
>
> HBase doesn't support querying XML with XPath. For that you will have to
> consider an XML database such as eXist
> (http://exist-db.org/exist/apps/homepage/index.html) or MarkLogic (requires
> a commercial licence). If you still want to use HBase, then consider storing
> metadata alongside the XML payload under the same row ID. You will have to
> think about your queries first, before deciding which metadata you want to
> store. In one of the projects I was involved in, we stored the metadata in
> Apache Solr using the HBase indexer (see the Cloudera product suite:
> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-search/v1-latest/Cloudera-Search-User-Guide/csug_use_hbase_indexer_service.html),
> which updates in near real time. So the XML payload ends up in HBase and the
> metadata goes into Apache Solr, and you query against Apache Solr as
> opposed to HBase.
>
> There are various ways to achieve what you want; it all comes down to
> the choice of technology and the architecture drivers.
>
> All the best,
>
> Ayache
>
> On 4 January 2015 at 11:47, Shashidhar Rao <raoshashidhar...@gmail.com>
> wrote:
>
> > Ayache and Chandrashekhar,
> >
> > You are correct; even I am reluctant to go for JSON transformation.
> > Storing XML in HBase without transformation to JSON would be a lot
> > easier at the storing stage.
> >
> > But my concern is querying this XML data from HBase. Queries include
> > aggregation, count and joins, just to name a few. Can you please shed
> > some light on how to query XML data from HBase? Is it possible to use
> > XQuery or XPath?
> >
> > JSON transformation was considered because of MongoDB, as it supports
> > native JSON format and it seems to be good at analytics. Analytics would
> > come at a later stage.
> >
> > Can you please share some insights into XML querying from HBase? Any
> > links or examples would be helpful; I am unable to find any.
> >
> > Thanks in advance
> >
> > Shashi
> >
> > On Sun, Jan 4, 2015 at 4:18 PM, Ayache Khettar
> > <ayache.khet...@googlemail.com> wrote:
> >
> > > Hi
> > >
> > > You could perfectly well store XML in HBase without any issue. It all
> > > depends on what you do with the XML. To query the XML back, you will
> > > have to store its metadata with it under the same row ID. This way you
> > > can query the XML back. I would go for JSON transformation only if the
> > > downstream flow needs the payload in JSON format.
> > >
> > > Ayache
> > >
> > > On 4 January 2015 at 10:07, Chandrashekhar Kotekar
> > > <shekhar.kote...@gmail.com> wrote:
> > >
> > > > You can convert XML to JSON using a MapReduce program and then store
> > > > the JSON in HBase, but you need to decide what your row key should be.
> > > >
> > > > Another point you have to take into account is whether you want to
> > > > search inside the JSON or not. If you do, then HBase won't be the best
> > > > option for you. You can probably switch to MongoDB or some other
> > > > document store.
> > > >
> > > > Hope it helps...
> > > >
> > > > Regards,
> > > > Chandrashekhar
> > > >
> > > > On 04-Jan-2015 3:32 PM, "Shashidhar Rao" <raoshashidhar...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Can someone guide me on whether the solution I am proposing is a
> > > > > feasible option or not?
> > > > >
> > > > > 1. Large XML data is delivered by an external system.
> > > > > 2. Convert it into JSON format.
> > > > > 3. Store it in HBase, even though there will be hardly any updates,
> > > > >    only retrieval. I have looked at Hive but finally had to decide
> > > > >    against it, as retrieval would be slow.
> > > > > 4. Need to use Hadoop NoSQL, as the other components all use the
> > > > >    Hadoop ecosystem.
> > > > >
> > > > > Can XML data be stored directly in HBase without any
> > > > > transformation? (second question)
> > > > >
> > > > > Any suggestions on storing XML data in NoSQL? (Only open source,
> > > > > no commercial NoSQL.)
> > > > >
> > > > > Thanks in advance
> > > > >
> > > > > Shashi
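
For anyone reading this thread later, here is a rough sketch of the layout Ayache describes: the raw XML stays untouched in one cell, and a few fields chosen up front are copied into metadata cells under the same row ID, so lookups never have to parse the XML. Everything specific here is an assumption for illustration only: the column names ("d:payload", "m:..."), the sample order documents, the field paths, and the plain Python dict standing in for the HBase table. A real setup would write through an HBase client and, as in Ayache's project, could index the metadata in Solr instead of scanning it.

```python
import xml.etree.ElementTree as ET

# Hypothetical column layout: "d:payload" holds the raw XML,
# and the "m" family holds one cell per extracted metadata field.
PAYLOAD_COLUMN = "d:payload"
METADATA_FAMILY = "m"


def extract_metadata(xml_text, field_paths):
    """Pull the chosen fields out of the XML once, at write time."""
    root = ET.fromstring(xml_text)
    metadata = {}
    for name, path in field_paths.items():
        value = root.findtext(path)  # path is relative to the root element
        if value is not None:
            metadata[name] = value.strip()
    return metadata


def build_row(xml_text, field_paths):
    """Return the cells for one row: the untouched XML plus its metadata."""
    cells = {PAYLOAD_COLUMN: xml_text}
    for name, value in extract_metadata(xml_text, field_paths).items():
        cells[f"{METADATA_FAMILY}:{name}"] = value
    return cells


def find_row_keys(table, field, value):
    """Match on metadata cells only; the XML payload is not parsed here."""
    column = f"{METADATA_FAMILY}:{field}"
    return sorted(key for key, cells in table.items() if cells.get(column) == value)


# In-memory stand-in for an HBase table: row key -> {column: value}.
table = {}

# The queries you expect decide which fields to extract (made-up paths).
field_paths = {"customer": "customer/name", "status": "status"}

orders = {
    "order-0001": "<order><customer><name>Asha</name></customer><status>open</status></order>",
    "order-0002": "<order><customer><name>Ravi</name></customer><status>closed</status></order>",
    "order-0003": "<order><customer><name>Asha</name></customer><status>closed</status></order>",
}
for row_key, xml_text in orders.items():
    table[row_key] = build_row(xml_text, field_paths)

# Look up row keys by metadata, then fetch the original XML by row key.
closed = find_row_keys(table, "status", "closed")
closed_payloads = [table[key][PAYLOAD_COLUMN] for key in closed]
```

The point of the design is that the metadata fields are fixed when the row is written, which is why the queries have to be thought through first. Aggregations and joins over fields that were not extracted would still require reading and parsing every payload.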