Hello

Nobody has mentioned it yet, but you can also use the Spark Cassandra
Connector, preferably when your data set is so big that a simple COPY to CSV
cannot handle it.
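A minimal sketch of that route (everything below is an assumption for illustration, not from this thread: the connector version, contact point, keyspace, table, and output path would all need adapting):

```shell
# Hypothetical sketch: export a filtered subset through the spark-cassandra-connector.
# Package coordinates, host, and paths are assumptions, not tested here.
SPARK_CMD="spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.12:3.4.1 \
 --conf spark.cassandra.connection.host=data1.com"
echo "$SPARK_CMD"
# Inside spark-shell you would then run (Scala):
#   val df = spark.read.format("org.apache.spark.sql.cassandra")
#     .options(Map("keyspace" -> "dev_keyspace", "table" -> "probe_sensors"))
#     .load()
#   df.filter($"localisation_id" === 208812).write.csv("/home/dump")
```

Note that a filter on a non-key column is only pushed down to Cassandra when the connector can translate it; otherwise Spark scans the table and filters afterwards, which is still usually more practical than COPY for very large tables.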

Regards

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay


On Fri, Jan 17, 2020 at 8:11 PM Durity, Sean R <sean_r_dur...@homedepot.com>
wrote:

> sstablekeys (in the tools directory?) can extract the actual keys from
> your sstables. You have to run it on each node and then combine and de-dupe
> the final results, but I have used this technique with a query generator to
> extract data more efficiently.
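The combine-and-de-dupe step can be sketched like this (the sstablekeys invocation in the comment and the file names are illustrative; the per-node key dumps are simulated so the de-duplication itself is runnable):

```shell
# On each node you would run something like (path is only an example):
#   sstablekeys /var/lib/cassandra/data/dev_keyspace/probe_sensors-*/mc-1-big-Data.db > keys_nodeN.txt
# Simulated per-node outputs standing in for real sstablekeys dumps:
printf '1\n2\n3\n' > keys_node1.txt
printf '2\n3\n4\n' > keys_node2.txt
# Combine the per-node dumps and de-duplicate in one pass:
sort -u keys_node1.txt keys_node2.txt > all_keys.txt
cat all_keys.txt   # 1 2 3 4, one key per line
```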
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Chris Splinter <chris.splinter...@gmail.com>
> *Sent:* Friday, January 17, 2020 1:47 PM
> *To:* adrien ruffie <adriennolar...@hotmail.fr>
> *Cc:* user@cassandra.apache.org; Erick Ramirez <flightc...@gmail.com>
> *Subject:* [EXTERNAL] Re: COPY command with where condition
>
>
>
> Do you know your partition keys?
>
>
>
> One option could be to enumerate that list of partition keys in separate
> cmds to make the individual operations less expensive for the cluster.
>
>
>
> For example:
>
> Say your partition key column is called id and the ids in your database
> are [1,2,3]
>
>
>
> You could do
>
> ./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT *
> FROM probe_sensors WHERE id = 1 AND localisation_id = 208812" -url
> /home/dump
>
> ./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT *
> FROM probe_sensors WHERE id = 2 AND localisation_id = 208812" -url
> /home/dump
>
> ./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT *
> FROM probe_sensors WHERE id = 3 AND localisation_id = 208812" -url
> /home/dump
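With more than a handful of ids, the per-key commands above can be generated by a small loop instead of typed out. A sketch (the id list and the per-id output directories are illustrative; the commands are only echoed here, so pipe the output to sh, or drop the echo, to actually run them):

```shell
# Sketch: one dsbulk unload command per partition key, echoed rather than executed.
cmds=$(for id in 1 2 3; do
  echo "./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query \"SELECT * FROM probe_sensors WHERE id = ${id} AND localisation_id = 208812\" -url /home/dump/${id}"
done)
echo "$cmds"
```

Using a separate -url directory per id (an assumption about how you want the output laid out) keeps the unloads from writing into the same directory.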
>
>
>
>
>
> Does that option work for you?
>
>
>
>
>
>
>
> On Fri, Jan 17, 2020 at 12:17 PM adrien ruffie <adriennolar...@hotmail.fr>
> wrote:
>
> I don't really know yet for the production environment, but in the
> development environment the table contains more than 10,000,000 rows.
>
> But we need just a subset of this table, not the entirety ...
> ------------------------------
>
> *De :* Chris Splinter <chris.splinter...@gmail.com>
> *Envoyé :* vendredi 17 janvier 2020 17:40
> *À :* adrien ruffie <adriennolar...@hotmail.fr>
> *Cc :* user@cassandra.apache.org <user@cassandra.apache.org>; Erick
> Ramirez <flightc...@gmail.com>
> *Objet :* Re: COPY command with where condition
>
>
>
> What you are seeing there is a standard read timeout. How many rows do you
> expect back from that query?
>
>
>
> On Fri, Jan 17, 2020 at 9:50 AM adrien ruffie <adriennolar...@hotmail.fr>
> wrote:
>
> Thank you very much,
>
>
>
>  so I run this request, for example -->
>
>
>
> ./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT *
> FROM probe_sensors WHERE localisation_id = 208812 ALLOW FILTERING" -url
> /home/dump
>
>
>
>
>
> But I get the following error
>
> com.datastax.dsbulk.executor.api.exception.BulkExecutionException:
> Statement execution failed: SELECT * FROM crt_sensors WHERE site_id =
> 208812 ALLOW FILTERING (Cassandra timeout during read query at consistency
> LOCAL_ONE (1 responses were required but only 0 replica responded))
>
>
>
> but I configured my driver with the following driver.conf, and nothing works
> correctly. Do you know what the problem is?
>
>
>
> datastax-java-driver {
>
>     basic {
>
>
>
>
>
>         contact-points = ["data1.com:9042","data2.com:9042"]
>
>
>
>         request {
>
>             timeout = "2000000"
>
>             consistency = "LOCAL_ONE"
>
>
>
>         }
>
>     }
>
>     advanced {
>
>
>
>         auth-provider {
>
>             class = PlainTextAuthProvider
>
>             username = "superuser"
>
>             password = "mypass"
>
>
>
>         }
>
>     }
>
> }
> ------------------------------
>
> *De :* Chris Splinter <chris.splinter...@gmail.com>
> *Envoyé :* vendredi 17 janvier 2020 16:17
> *À :* user@cassandra.apache.org <user@cassandra.apache.org>
> *Cc :* Erick Ramirez <flightc...@gmail.com>
> *Objet :* Re: COPY command with where condition
>
>
>
> DSBulk has an option that lets you specify the query ( including a WHERE
> clause )
>
>
>
> See Example 19 in this blog post for details: 
> https://www.datastax.com/blog/2019/06/datastax-bulk-loader-unloading
>
>
>
> On Fri, Jan 17, 2020 at 7:34 AM Jean Tremblay <
> jean.tremb...@zen-innovations.com> wrote:
>
> Did you think about using a Materialised View to generate what you want to
> keep, and then use DSBulk to extract the data?
>
>
>
> On 17 Jan 2020, at 14:30 , adrien ruffie <adriennolar...@hotmail.fr>
> wrote:
>
>
>
> Sorry, I'm coming back with a quick question about the bulk loader ...
>
>
>
> https://www.datastax.com/blog/2018/05/introducing-datastax-bulk-loader
>
>
>
> I read this : "Operations such as converting strings to lowercase,
> arithmetic on input columns, or filtering out rows based on some criteria,
> are not supported. "
>
>
>
> Consequently, it's still not possible to use a WHERE clause with DSBulk,
> right?
>
>
>
> I don't really know how to do it in a way that avoids exporting the whole
> set of business data already stored, when most of it doesn't need to be
> exported...
>
>
>
>
>
>
> ------------------------------
>
> *De :* adrien ruffie <adriennolar...@hotmail.fr>
> *Envoyé :* vendredi 17 janvier 2020 11:39
> *À :* Erick Ramirez <flightc...@gmail.com>; user@cassandra.apache.org <
> user@cassandra.apache.org>
> *Objet :* RE: COPY command with where condition
>
>
>
> Thanks a lot!
>
> That's good news about DSBulk! I will take a look at this solution.
>
>
>
> best regards,
>
> Adrian
> ------------------------------
>
> *De :* Erick Ramirez <flightc...@gmail.com>
> *Envoyé :* vendredi 17 janvier 2020 10:02
> *À :* user@cassandra.apache.org <user@cassandra.apache.org>
> *Objet :* Re: COPY command with where condition
>
>
>
> The COPY command doesn't support filtering and it doesn't perform well for
> large tables.
>
>
>
> Have you considered the DSBulk tool from DataStax? Previously, it only
> worked with DataStax Enterprise but a few weeks ago, it was made free and
> works with open-source Apache Cassandra. For details, see this blog post:
> https://www.datastax.com/blog/2019/12/tools-for-apache-cassandra. Cheers!
>
>
>
> On Fri, Jan 17, 2020 at 6:57 PM adrien ruffie <adriennolar...@hotmail.fr>
> wrote:
>
> Hello all,
>
>
>
> In my company we want to export a big dataset from our Cassandra ring.
>
> We looked at the COPY command, but I can't find whether and how a WHERE
> condition can be used.
>
>
>
> Because we need to export only a subset of the data, which must be returned
> by a WHERE clause, unfortunately with ALLOW FILTERING due to several old
> tables that were poorly designed...
>
>
>
> Do you know a way to do that, please?
>
>
>
> Thanks to all, and best regards
>
>
>
> Adrian
>
>
>
>
