Check out this DataStax article:

http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsBulkloader_t.html

Code examples can be found here:

https://github.com/PatrickCallaghan/datastax-bulkloader-writer-example

You can write a writer in Scala or Java which will convert CSV etc. into
SSTables, and then use sstableloader to load them directly into Cassandra.
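
As a rough illustration of the CSV side of such a writer, here is a minimal Python sketch. The SSTable writer itself would be Java or Scala (for example built on Cassandra's CQLSSTableWriter class); this only shows the parsing and type conversion that would happen before rows are handed to it. The column names `id`, `ts` and `value`, the timestamp format, the sample data and the `parse_rows` helper are all assumptions made up for this example; they are not taken from the article or the repository linked above.

```python
import csv
import io
from datetime import datetime

# Hypothetical timestamp format, assumed for this sketch only.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_rows(csv_text):
    """Yield (id, ts, value) tuples from CSV text, skipping malformed lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for record in reader:
        try:
            yield (
                record["id"].strip(),
                datetime.strptime(record["ts"].strip(), TS_FORMAT),
                float(record["value"]),
            )
        except (KeyError, ValueError, AttributeError, TypeError):
            # Bad or incomplete line: skip it rather than abort a large load.
            continue


# Made-up sample input; the second data line has an invalid timestamp.
SAMPLE = """id,ts,value
m-001,2014-12-26 12:00:00,1.5
m-002,not-a-date,2.0
m-003,2014-12-26 12:30:00,0.75
"""

rows = list(parse_rows(SAMPLE))
for row in rows:
    print(row)
```

Each tuple produced here stands for one row that the JVM-side writer would append to an SSTable; the resulting files are what sstableloader then streams into the cluster.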




K

-- 
Keith Sterling
Head of Software

E: keith.sterl...@first-utility.com
P: +44 7771 597 630
W: first-utility.com
A: Opus 40 Business Park,
Haywood Road, Warwick CV34 5AH

On Sat, Dec 27, 2014 at 1:11 PM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> Sorry, but you are still not being clear. In particular, "website data" has
> no common, defined meaning. You'll need to use some standard, defined
> terminology or specific examples so that we can have some idea what you are
> referring to.
> The blog post you cited is referring to the Twitter API, presumably to read
> tweets. Okay, fine, but you'll have to be more specific about what you want
> to do with them. Yes, Cassandra is primarily focused on structured data, but
> you can of course store unstructured and semi-structured data as blobs,
> JSON strings, map columns, etc.
> Please describe in a little more detail what problem you are trying to
> solve.
> I mean, "website data" might mean any data (in any format) stored at a web
> URL, which might be a web "page", a data file linked by a web page, or...
> it could be a REST API (like Twitter). Or it could be... whatever. Cassandra
> is basically a storage engine - it can store anything. There are a wide
> variety of tools that can be used to "ingest" data from the infinite
> variety of "sources" for data. But you'll need to state more specifically
> what you are actually trying to accomplish.
> Also, "large data" could be... anything, like "Big Data". So more
> specificity is needed.
> Alternatively, you could hire a consultant to help guide you through the
> "application analysis" process to determine your "application
> requirements", and then you could simply post your application
> requirements, or at least a concise summary or relevant excerpt.
> -- Jack Krupansky
> On Sat, Dec 27, 2014 at 1:48 AM, Joanne Contact <joannenetw...@gmail.com>
> wrote:
>> Thank you. I did not express my question clearly.
>>
>> I wonder if there is sample code to load any website data to Cassandra?
>>
>> Say, this webpage http://datatomix.com/?p=84 seems to use Python, tweepy,
>> to use twitter API to get data in json format and then load data into
>> Cassandra.
>>
>> So it seems tweepy is specific to the Twitter API. Is there code for any
>> website?
>> Btw I am not familiar with Python yet. So the answer may not be limited to
>> Python.
>>
>> Thanks!
>>
>> On Fri, Dec 26, 2014 at 12:46 PM, Keith Sterling <
>> keith.sterl...@first-utility.com> wrote:
>>
>>> Take a look at sstableloader. We use it to load 30+m rows into Cassandra
>>>
>>> The DataStax documentation is a good start.
>>>
>>> --
>>> *Keith Sterling*
>>> *Head of Software*
>>>
>>>  *E:* keith.sterl...@first-utility.com <stephen.l...@first-utility.com>
>>>  *P:* +44 7771 597 630
>>>  *W:* first-utility.com <http://www.first-utility.com/>
>>>  *A:* Opus 40 Business Park,
>>> Haywood Road, Warwick CV34 5AH
>>>
>>>
>>>
>>> On Fri, Dec 26, 2014 at 7:59 PM, Joanne Contact <joannenetw...@gmail.com>
>>> wrote:
>>>
>>>>  Hello, I am new. I did not seem to find the answer after some brief
>>>> research. Please help.
>>>>
>>>> Thanks!
>>>>
>>>> J
>>>>
>>>>
>>>
>>
