Hi,
Which version of Cassandra should be considered the most stable in the 3.x line?
I see two main branches: the branch with the version 3.0.* and the tick-tock one
3.*.*.
So basically my question is: which one is most stable, version 3.0.5 or version
3.3?
I know odd versions in tick-tock are bug-fix releases.
Check out the text indexing in the new SASI feature in Cassandra
3.4. You could write a custom tokenizer to extract entities and then be
able to query for documents that contain those entities.
That said, using a SHA digest key for the primary key has merit for direct
access to the
S3 maybe?
On Mon, Apr 11, 2016 at 7:05 PM Robert Wille wrote:
> I do realize it's kind of a weird use case, but it is legitimate. I have a
> collection of documents that I need to index, and I want to perform entity
> extraction on them and give the extracted entities special
I do realize it's kind of a weird use case, but it is legitimate. I have a
collection of documents that I need to index, and I want to perform entity
extraction on them and give the extracted entities special treatment in my
full-text index. Because entity extraction costs money, and each
Check your environment variables; it looks like JAVA_HOME is not properly set.
On Mon, Apr 11, 2016 at 9:07 AM, Lokesh Ceeba - Vendor <
lokesh.ce...@walmart.com> wrote:
> Hi Team,
>
> Help required
>
>
>
> cassandra:/app/cassandra $ nodetool status
>
>
>
> Cassandra 2.0 and later
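One way to act on the JAVA_HOME suggestion above is a quick check like the following. This is only a minimal sketch, assuming a Unix-like host where the Java binary lives at $JAVA_HOME/bin/java; the helper name check_java_home is made up for illustration and is not part of Cassandra or nodetool.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Report whether JAVA_HOME is set and points at a directory with bin/java.

    Hypothetical helper for illustration; assumes a Unix-like layout.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    java_bin = Path(java_home) / "bin" / "java"
    if not java_bin.exists():
        return f"JAVA_HOME is set to {java_home}, but {java_bin} does not exist"
    return f"JAVA_HOME looks usable: {java_home}"


# Check the environment of the current process.
print(check_java_home())
```

Run it as the same user and in the same shell that runs nodetool, since the variable may be set for one account but not another.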
Hi Robert,
why do you need the actual text as a key? It sounds a bit unnatural, at
least to me. Keep in mind that you cannot do "like" queries on keys in
Cassandra. For performance and to keep things more readable I would
prefer hashing your text and using the hash as the key.
You should also take
On Mon, Apr 11, 2016 at 4:19 PM, Jack Krupansky
wrote:
> Some of this may depend on exactly how you are using so-called COMPACT
> STORAGE. I mean, if your tables really are modeled as all but exactly one
> column in the primary key, then okay, COMPACT STORAGE may be a
Why does the text need to be the key?
On Mon, Apr 11, 2016 at 6:04 PM Robert Wille wrote:
> I have a need to be able to use the text of a document as the primary key
> in a table. These texts are usually less than 1K, but can sometimes be 10’s
> of K’s in size. Would it be
While large primary keys (within reason) should work, IMO anytime you're
doing equality testing you are really better off minimizing the size of the
key. Huge primary keys will also have very negative impacts on your key
cache. I would err on the side of the digest, but I've never had a need for
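The digest approach suggested in this thread could look roughly like this. It is a sketch only: SHA-256 is an assumption (the thread just says "SHA digest"), and document_key is a made-up helper name. The point is that the key has a fixed length no matter how long the document text is.

```python
import hashlib


def document_key(text: str) -> str:
    """Return a fixed-length hex digest to use as the key instead of the full text."""
    # Encode explicitly so the same text always maps to the same bytes.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_doc = "Cassandra is a distributed database."
long_doc = "lorem ipsum " * 5000  # stands in for a document tens of KB long

for doc in (short_doc, long_doc):
    key = document_key(doc)
    # The key is always 64 hex characters, whatever the document size.
    print(len(doc), "chars ->", len(key), "char key:", key[:16] + "...")
```

The original text would then be stored in a regular column, so it is still available after a lookup by digest.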
I have a need to be able to use the text of a document as the primary key in a
table. These texts are usually less than 1K, but can sometimes be 10’s of K’s
in size. Would it be better to use a digest of the text as the key? I have a
background process that will occasionally need to do a full
Hello,
In a multi-DC setup (where one DC serves real-time traffic and the other DC
serves analytical loads), is it possible to set up secondary indexes and
restrict them to the analytics DC only? The intent is to not create the overhead of
the secondary index on the DC where real-time traffic is
Since when did this become a DataStax support email list? If folks have
questions about DataStax products, shouldn't they be contacting the company
directly?
On Sun, Apr 10, 2016 at 1:13 PM Jeff Jirsa
wrote:
> It is possible to use OpsCenter for open source /
Thanks Jim. I think you understand the pain of migrating TBs of data to new
tables. There is no command to change from compact to non-compact storage, and
even the fastest option, migrating the data with Spark, is too slow for production
systems.
And the pain gets bigger when your performance dips
Jack, the Datastax link he posted (
http://www.datastax.com/dev/blog/thrift-to-cql3) says that for column
families with mixed dynamic and static columns: "The only solution to be
able to access the column family fully is to remove the declared columns
from the thrift schema altogether..." I think
You're not mistaken, just thought you were after partition keys and didn't
read the question that carefully. Afaik, you're SOOL if you need to
distinguish clustering keys as unique. Well, other than doing a full table
scan of course, which I'm assuming is not too plausible.
On Mon, 11 Apr 2016 at
Unless I'm mistaken, nodetool tablestats gives you the number of partitions
(partition keys), not the number of primary keys. IOW, the term "keys" is
ambiguous. That's why I phrased the original question as count of (CQL)
rows, to distinguish from the pre-CQL3 concept of a partition being treated
Wouldn't the "number of keys" part of *nodetool cfstats* run on every node,
summed and divided by replication factor give you a decent approximation?
Or are you really after a completely precise number?
On Mon, 11 Apr 2016 at 16:18 Jack Krupansky
wrote:
> Agreed, that
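The approximation proposed above (sum the per-node "number of keys" and divide by the replication factor) is simple arithmetic. The sketch below uses made-up per-node figures; it does not call nodetool, and estimate_partitions is a hypothetical helper. As noted elsewhere in this thread, the result approximates partition keys, not CQL rows.

```python
def estimate_partitions(keys_per_node, replication_factor):
    """Approximate the cluster-wide partition count from per-node key counts.

    Each partition is stored on `replication_factor` nodes, so the per-node
    counts are summed and divided by that factor.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return sum(keys_per_node) / replication_factor


# Made-up "number of keys" values, one per node, for a 3-node cluster with RF=3.
node_counts = [1200, 1150, 1250]
print(estimate_partitions(node_counts, 3))  # prints 1200.0
```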
Scott Thompson
Sorry, but your message is too confusing - you say "reading dynamic columns
in CQL" and "make the table schema less", but neither has any relevance to
CQL! 1. CQL tables always have schemas. 2. All columns in CQL are
statically declared (even maps/collections are statically declared
columns.)
Agreed, that anything requiring a full table scan, short of batch
analytics, is an antipattern, although the goal is not to do a full scan per
se, but just get the row count. It still surprises people that Cassandra
cannot quickly get COUNT(*). The easy answer: Use DSE Search and do a Solr
query
Cassandra is not good for table scan type queries (which count(*) typically
is). While there are some attempts to do that (as noted below), this is a path
I avoid.
Sean Durity
From: Max C [mailto:mc_cassan...@core43.com]
Sent: Saturday, April 09, 2016 6:19 PM
To: user@cassandra.apache.org
Any comments or suggestions on this one?
Thanks
Anuj
On Sun, 10 Apr, 2016 at 11:39 PM, Anuj Wadehra wrote:
Hi
We are on 2.0.14 and Thrift. We are planning to migrate to CQL soon but are facing
some challenges.
We have a cf with a mix of
The Cassandra team is pleased to announce the release of Apache Cassandra
version 3.0.5.
Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.
http://cassandra.apache.org/
Downloads of source
Where do you get the ~1ms latency between AZs? Comparing a short-term
average to a 99th percentile isn't very fair.
"Over the last month, the median is 2.09 ms, 90th percentile is 20ms,
99th percentile is 47ms." - per
Thanks, Alain, for all your answers:
- In a few days I am going to set up a maintenance window so I can
test running repairs again and see what happens. I will definitely run
'iostat -mx 5 100' at that time and also use the command you pointed out to
see why it is consuming so much power.
-
Hi Hannu,
Thank you for the pointer. We ended up using materialized views in
Cassandra 3.0.3. Seems to do the trick :)
On Thu, 17 Mar 2016 at 11:16, Hannu Kröger wrote:
> Hi,
>
> That’s how I have done it on many occasions. Nowadays there is the
> possibility to use Cassandra