Hello. Nobody has mentioned it yet, but you can also use the Spark Cassandra Connector, preferably when your data set is so big that a simple COPY to CSV cannot handle it.
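A minimal sketch of a filtered unload with the Spark Cassandra Connector, assuming the keyspace, table, column, and contact point mentioned later in this thread; the output path, connector version, and PySpark usage are illustrative assumptions, not a tested recipe:

```python
# Sketch only: requires pyspark plus the spark-cassandra-connector package
# on the classpath, e.g.:
#   spark-submit --packages com.datastax.spark:spark-cassandra-connector_2.12:2.4.3 unload.py
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("filtered-unload")
         .config("spark.cassandra.connection.host", "data1com")  # contact point from the thread
         .getOrCreate())

# Read the table, let the connector push the predicate down to Cassandra
# where possible, and write only the matching rows out as CSV.
(spark.read
 .format("org.apache.spark.sql.cassandra")
 .options(keyspace="dev_keyspace", table="probe_sensors")
 .load()
 .filter("localisation_id = 208812")
 .write
 .option("header", "true")
 .csv("/home/dump/probe_sensors_208812"))  # hypothetical output directory

spark.stop()
```

Because Spark distributes the scan across executors and streams partitions, this tends to cope with tables far too large for cqlsh COPY.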
Saludos

Jean Carlo

"The best way to predict the future is to invent it"  Alan Kay

On Fri, Jan 17, 2020 at 8:11 PM Durity, Sean R <sean_r_dur...@homedepot.com> wrote:

> sstablekeys (in the tools directory?) can extract the actual keys from
> your sstables. You have to run it on each node and then combine and de-dupe
> the final results, but I have used this technique with a query generator to
> extract data more efficiently.
>
> Sean Durity
>
> *From:* Chris Splinter <chris.splinter...@gmail.com>
> *Sent:* Friday, January 17, 2020 1:47 PM
> *To:* adrien ruffie <adriennolar...@hotmail.fr>
> *Cc:* user@cassandra.apache.org; Erick Ramirez <flightc...@gmail.com>
> *Subject:* [EXTERNAL] Re: COPY command with where condition
>
> Do you know your partition keys?
>
> One option could be to enumerate that list of partition keys in separate
> commands to make the individual operations less expensive for the cluster.
>
> For example, say your partition key column is called id and the ids in
> your database are [1, 2, 3]. You could do:
>
> ./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT * FROM probe_sensors WHERE id = 1 AND localisation_id = 208812" -url /home/dump
> ./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT * FROM probe_sensors WHERE id = 2 AND localisation_id = 208812" -url /home/dump
> ./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT * FROM probe_sensors WHERE id = 3 AND localisation_id = 208812" -url /home/dump
>
> Does that option work for you?
>
> On Fri, Jan 17, 2020 at 12:17 PM adrien ruffie <adriennolar...@hotmail.fr> wrote:
>
> I don't really know for the moment in the production environment, but in
> the development environment the table contains more than 10,000,000 rows.
>
> But we need just a subset of this table, not the entirety...
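The enumeration idea above is easy to script once you have the key list (e.g. from sstablekeys output, combined and de-duped). A minimal sketch using the thread's example keyspace, table, and ids; the per-key output subdirectories and the helper name are assumptions:

```python
import shlex
import subprocess  # only needed if you uncomment the run step below

def dsbulk_unload_cmds(ids, keyspace="dev_keyspace", table="probe_sensors",
                       localisation_id=208812, url="/home/dump"):
    """Build one dsbulk unload command per partition key, so each query
    stays a cheap single-partition read instead of a full-table scan."""
    cmds = []
    for i in ids:
        query = (f"SELECT * FROM {table} "
                 f"WHERE id = {i} AND localisation_id = {localisation_id}")
        cmds.append(["./dsbulk", "unload",
                     "--dsbulk.schema.keyspace", keyspace,
                     "-query", query,
                     "-url", f"{url}/id_{i}"])  # separate dir per key
    return cmds

if __name__ == "__main__":
    for cmd in dsbulk_unload_cmds([1, 2, 3]):
        print(" ".join(shlex.quote(part) for part in cmd))
        # subprocess.run(cmd, check=True)  # uncomment to actually execute
```

Running the commands sequentially (or with a small pool) keeps the load on the cluster predictable, which is the point of splitting the unload per key.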
> ------------------------------
> *From:* Chris Splinter <chris.splinter...@gmail.com>
> *Sent:* Friday, January 17, 2020 5:40 PM
> *To:* adrien ruffie <adriennolar...@hotmail.fr>
> *Cc:* user@cassandra.apache.org <user@cassandra.apache.org>; Erick Ramirez <flightc...@gmail.com>
> *Subject:* Re: COPY command with where condition
>
> What you are seeing there is a standard read timeout. How many rows do you
> expect back from that query?
>
> On Fri, Jan 17, 2020 at 9:50 AM adrien ruffie <adriennolar...@hotmail.fr> wrote:
>
> Thank you very much,
>
> so I ran the request, for example:
>
> ./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT * FROM probe_sensors WHERE localisation_id = 208812 ALLOW FILTERING" -url /home/dump
>
> But I get the following error:
>
> com.datastax.dsbulk.executor.api.exception.BulkExecutionException: Statement execution failed: SELECT * FROM crt_sensors WHERE site_id = 208812 ALLOW FILTERING (Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded))
>
> I configured my driver with the following driver.conf, but nothing works
> correctly. Do you know what the problem is?
> datastax-java-driver {
>   basic {
>     contact-points = ["data1com:9042", "data2.com:9042"]
>     request {
>       timeout = "2000000"
>       consistency = "LOCAL_ONE"
>     }
>   }
>   advanced {
>     auth-provider {
>       class = PlainTextAuthProvider
>       username = "superuser"
>       password = "mypass"
>     }
>   }
> }
> ------------------------------
> *From:* Chris Splinter <chris.splinter...@gmail.com>
> *Sent:* Friday, January 17, 2020 4:17 PM
> *To:* user@cassandra.apache.org <user@cassandra.apache.org>
> *Cc:* Erick Ramirez <flightc...@gmail.com>
> *Subject:* Re: COPY command with where condition
>
> DSBulk has an option that lets you specify the query (including a WHERE clause).
>
> See Example 19 in this blog post for details:
> https://www.datastax.com/blog/2019/06/datastax-bulk-loader-unloading
>
> On Fri, Jan 17, 2020 at 7:34 AM Jean Tremblay <jean.tremb...@zen-innovations.com> wrote:
>
> Did you think about using a Materialized View to generate what you want to
> keep, and then using DSBulk to extract the data?
>
> On 17 Jan 2020, at 14:30, adrien ruffie <adriennolar...@hotmail.fr> wrote:
>
> Sorry, I come back to a quick question about the bulk loader...
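Two things worth noting about the config above. First, the timeout in the error message is the coordinator's, governed server-side by `read_request_timeout_in_ms` in cassandra.yaml, so raising the client-side timeout alone may not help an ALLOW FILTERING full scan. Second, the driver 4.x HOCON reads `basic.request.timeout` as a duration, so an explicit unit is clearer than a bare number. A sketch of the request fragment with explicit units (values are illustrative, not a verified fix):

```hocon
# Hypothetical values; adjust to your environment.
datastax-java-driver {
  basic {
    contact-points = ["data1com:9042", "data2.com:9042"]
    request {
      timeout = "5 minutes"      # how long the client waits, with an explicit unit
      consistency = "LOCAL_ONE"
    }
  }
}
```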
> https://www.datastax.com/blog/2018/05/introducing-datastax-bulk-loader
>
> I read this: "Operations such as converting strings to lowercase,
> arithmetic on input columns, or filtering out rows based on some criteria,
> are not supported."
>
> Consequently, it's still not possible to use a WHERE clause with DSBulk,
> right?
>
> I don't really know how to do it, in order not to export the whole of the
> business data already stored when we only need part of it...
>
> ------------------------------
> *From:* adrien ruffie <adriennolar...@hotmail.fr>
> *Sent:* Friday, January 17, 2020 11:39 AM
> *To:* Erick Ramirez <flightc...@gmail.com>; user@cassandra.apache.org <user@cassandra.apache.org>
> *Subject:* RE: COPY command with where condition
>
> Thanks a lot!
>
> That's good news for DSBulk! I will take a look at this solution.
>
> Best regards,
> Adrian
> ------------------------------
> *From:* Erick Ramirez <flightc...@gmail.com>
> *Sent:* Friday, January 17, 2020 10:02 AM
> *To:* user@cassandra.apache.org <user@cassandra.apache.org>
> *Subject:* Re: COPY command with where condition
>
> The COPY command doesn't support filtering, and it doesn't perform well for
> large tables.
>
> Have you considered the DSBulk tool from DataStax? Previously, it only
> worked with DataStax Enterprise, but a few weeks ago it was made free and
> works with open-source Apache Cassandra. For details, see this blog post:
> https://www.datastax.com/blog/2019/12/tools-for-apache-cassandra
>
> Cheers!
> On Fri, Jan 17, 2020 at 6:57 PM adrien ruffie <adriennolar...@hotmail.fr> wrote:
>
> Hello all,
>
> In my company we want to export a big dataset from our Cassandra ring.
>
> We looked at using the COPY command, but I can't find whether and how a
> WHERE condition can be used.
>
> We need to export only some of the data, which must be selected by a WHERE
> clause, unfortunately with ALLOW FILTERING, due to several old tables
> which were poorly designed...
>
> Do you know a means to do that, please?
>
> Thanks to all, and best regards,
>
> Adrian