Check out this DataStax article:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsBulkloader_t.html

Code examples can be found here:
https://github.com/PatrickCallaghan/datastax-bulkloader-writer-example

You can write a writer in Scala or Java that converts CSV etc. into
SSTables, and then use sstableloader to load them directly into Cassandra.

K

--
Keith Sterling
Head of Software
E: keith.sterl...@first-utility.com
P: +44 7771 597 630
W: first-utility.com
A: Opus 40 Business Park, Haywood Road, Warwick CV34 5AH

On Sat, Dec 27, 2014 at 1:11 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> Sorry, but you are still not being clear. In particular, "website data" has
> no common, defined meaning. You'll need to use some standard, defined
> terminology or specific examples so that we can have some idea what you are
> referring to.
>
> The blog post you cited is referring to the Twitter API, presumably to read
> tweets. Okay, fine, but you'll have to be more specific about what you want
> to do with them. Yes, Cassandra is primarily focused on structured data, but
> you can of course store unstructured and semi-structured data as blobs,
> JSON strings, map columns, etc.
>
> Please describe in a little more detail what problem you are trying to
> solve.
>
> I mean, "website data" might mean any data (in any format) stored at a web
> URL, which might be a web "page", a data file linked by a web page, or...
> it could be a REST API like Twitter. Or it could be... whatever. Cassandra
> is basically a storage engine - it can store anything. There is a wide
> variety of tools that can be used to "ingest" data from the infinite
> variety of "sources" for data. But you'll need to state more specifically
> what you are actually trying to accomplish.
>
> Also, "large data" could be... anything, like "Big Data". So more
> specificity is needed.
>
> Alternatively, you could hire a consultant to help guide you through the
> "application analysis" process to determine your "application
> requirements", and then you could simply post your application
> requirements, or at least a concise summary or relevant excerpt.
>
> -- Jack Krupansky
>
> On Sat, Dec 27, 2014 at 1:48 AM, Joanne Contact <joannenetw...@gmail.com>
> wrote:
>
>> Thank you. I did not express my question clearly.
>>
>> I wonder if there is sample code to load any website data into Cassandra?
>>
>> Say, this webpage http://datatomix.com/?p=84 seems to use Python and
>> tweepy to get data in JSON format from the Twitter API and then load it
>> into Cassandra.
>>
>> So it seems tweepy is specific to the Twitter API. Is there code that
>> works for any website?
>> Btw, I am not familiar with Python yet, so the answer need not be limited
>> to Python.
>>
>> Thanks!
>>
>> On Fri, Dec 26, 2014 at 12:46 PM, Keith Sterling <
>> keith.sterl...@first-utility.com> wrote:
>>
>>> Take a look at sstableloader. We use it to load 30+m rows into Cassandra.
>>>
>>> The DataStax documentation is a good start.
>>>
>>> --
>>> *Keith Sterling*
>>> *Head of Software*
>>>
>>> *E:* keith.sterl...@first-utility.com <stephen.l...@first-utility.com>
>>> *P:* +44 7771 597 630
>>> *W:* first-utility.com <http://www.first-utility.com/>
>>> *A:* Opus 40 Business Park,
>>> Haywood Road, Warwick CV34 5AH
>>>
>>> On Fri, Dec 26, 2014 at 7:59 PM, Joanne Contact <joannenetw...@gmail.com>
>>> wrote:
>>>
>>>> Hello, I am new. I did not find the answer after some brief
>>>> research. Please help.
>>>>
>>>> Thanks!
>>>>
>>>> J
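
The workflow suggested at the top of this thread (turn CSV into SSTables, then hand them to sstableloader) has two ends that can be sketched without any Cassandra libraries. The Python below is only a rough illustration: the column layout in `COLUMNS`, the sample data, the host, and the output directory are all hypothetical. The middle step, actually writing SSTable files, is left to the Java or Scala writer mentioned in the reply and is not shown here.

```python
import csv
import io

# Hypothetical table layout, assumed purely for illustration:
# (column name, function that converts the CSV string to the column type).
COLUMNS = [("id", int), ("name", str), ("score", float)]


def parse_rows(csv_text):
    """Yield one typed tuple per CSV data row, in COLUMNS order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for record in reader:
        yield tuple(cast(record[name]) for name, cast in COLUMNS)


def sstableloader_command(host, table_dir):
    """Build (but do not run) an sstableloader command line.

    table_dir is expected to end in <keyspace>/<table>; sstableloader
    uses those directory names to decide where the data belongs.
    """
    return ["sstableloader", "-d", host, table_dir]


if __name__ == "__main__":
    sample = "id,name,score\n1,alice,9.5\n2,bob,7.25\n"
    for row in parse_rows(sample):
        # Each tuple is what would be handed to the SSTable writer.
        print(row)
    # Host and directory are placeholders, not values from this thread.
    print(" ".join(sstableloader_command("127.0.0.1", "/tmp/out/demo/users")))
```

Converting each field up front keeps type errors in the CSV from surfacing later, inside the writer. The command is returned as a list so it could be passed to `subprocess.run` unchanged once the SSTables exist.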