Thanks Lewis, I look forward to hearing more.

Best Regards

Dan

Dan Hanley
CTO, ActiveStandards
Direct: +44 (0)207 019 4718
Switchboard: +44 (0)20 7019 4700
dan.han...@activestandards.com

www.activestandards.com
________________________________

Driving Digital Transformation:
ActiveStandards launches new enterprise digital governance solutions
<https://activestandards.com/about-us/newsroom/driving-digital-transformation-activestandards-launches-new-enterprise-digital>

________________________________

ActiveStandards, Studio 1001 Highgate Studios, 53-79 Highgate Road, London, NW5 1TL
Registered in England: No. 3592714, VAT No. 625574723

From: Lewis John Mcgibbney [mailto:lewis.mcgibb...@gmail.com]
Sent: 05 December 2014 15:23
To: <user@gora.apache.org>
Subject: Re: Cassandra named fields support

Hi Dan,
I am currently working on implementing GORA-267 [0], Cassandra composite
primary key support, within the context of the gora-cassandra module.
I agree with you that the physical mapping you see is not easy to unpack and
parse within Spark. We also still permit the use of legacy super columns within
gora-cassandra, which we should migrate away from.
I'll look into the GoraCassandra codebase soon and provide more detail on what
you/we would need to meet your requirements.
Thanks
Lewis

[0] https://issues.apache.org/jira/browse/GORA-267

On Fri, Dec 5, 2014 at 5:56 AM, Dan Hanley <dan.han...@activestandards.com> wrote:
Hi
I’m using Gora (0.3) to pipe Nutch (2.2.1) data into Cassandra, and eventually
I’m hoping to analyse it with Spark.

The Gora-Cassandra mapping puts everything in three legacy-style Cassandra
tables (f, p and sc), all created roughly like:

CREATE TABLE p (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY ((key), column1)
) WITH COMPACT STORAGE AND….

This is not easy to parse as an RDD in Spark.

It would be easier if e.g. the mapping:

<field name="title" family="p" qualifier="t"/>
<field name="text" family="p" qualifier="c"/>
<field name="signature" family="p" qualifier="sig"/>
<field name="prevSignature" family="p" qualifier="psig"/>

Produced a table like:

CREATE TABLE p (
  key blob,
  title blob,
  text blob,
  signature blob,
  prevSignature blob,
  PRIMARY KEY (key)
) ….
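
As an illustration of the reshaping being asked for, here is a minimal Python
sketch that pivots (key, column1, value) rows from the compact-storage table
into one record of named fields per key. It rests on assumptions not confirmed
in this thread: that column1 holds the raw qualifier bytes from the mapping,
and that values can be left as undecoded blobs. The name pivot_rows and the
sample rows are hypothetical.

```python
from collections import defaultdict

# Qualifier -> field name, copied from the family "p" mapping quoted above.
QUALIFIER_TO_FIELD = {
    b"t": "title",
    b"c": "text",
    b"sig": "signature",
    b"psig": "prevSignature",
}


def pivot_rows(rows):
    """Group (key, column1, value) triples into one dict of named fields per key.

    Assumption: column1 is the raw qualifier as bytes. Values are kept as
    raw bytes; decoding them is outside the scope of this sketch.
    """
    records = defaultdict(dict)
    for key, column1, value in rows:
        field = QUALIFIER_TO_FIELD.get(column1)
        if field is None:
            # Unknown qualifier: keep it under its decoded name rather than drop it.
            field = column1.decode("utf-8", errors="replace")
        records[key][field] = value
    return dict(records)


if __name__ == "__main__":
    # Hypothetical sample rows in the shape of the compact-storage table.
    sample = [
        (b"row-1", b"t", b"Example title"),
        (b"row-1", b"c", b"Example body text"),
        (b"row-2", b"sig", b"\x00\x01"),
    ]
    for key, fields in pivot_rows(sample).items():
        print(key, fields)
```

In Spark the same grouping could be applied to an RDD of such triples by keying
on the row key and merging the per-key dictionaries; the sketch above only
shows the reshaping logic itself.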

So my question: is this possible in more recent versions of Gora? If not, is
it something I could reasonably expect to develop myself? (I have no
familiarity with the Gora codebase, so any pointers would be welcome.)

Best Regards

Dan





--
Lewis
