Re: row level atomicity and isolation

2018-05-16 Thread Rajesh Kishore
OK, got it. So only by using an LWT transaction can updates to a particular
row be isolated across nodes; basically, Paxos would ensure serializable isolation.
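
For reference, a minimal sketch of such a conditional (LWT) update through the
Java driver; the "shops" table, its columns, and the session variable are
hypothetical, and this is an illustration rather than a prescribed pattern:

// Hypothetical table: shops(shop_id PRIMARY KEY, name).
// The IF clause makes this an LWT, so concurrent conditional updates to the
// same row are serialized through Paxos.
ResultSet rs = session.execute(
        "UPDATE shops SET name = ? WHERE shop_id = ? IF name = ?",
        "new name", shopId, "old name");
// wasApplied() is false when the condition no longer holds, i.e. another
// writer changed the row first; the returned row carries the current value.
if (!rs.wasApplied()) {
    Row current = rs.one();
    // re-read / retry as appropriate
}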


Thanks,
Rajesh

On Wed, May 16, 2018 at 4:56 PM, kurt greaves  wrote:

> Atomicity and isolation are only guaranteed within a replica. If you have
> multiple concurrent requests across replicas, the last timestamp wins. You
> can get better isolation using LWT, which uses Paxos under the hood.
>
> On 16 May 2018 at 08:55, Rajesh Kishore  wrote:
>
>> Hi,
>>
>> I am just curious to know: when the Cassandra docs say that atomicity and
>> isolation are guaranteed for a row, does that mean two requests updating a
>> row "R1" at different replicas will be candidates for atomicity and
>> isolation?
>>
>> For instance, I have a setup where RF is 2.
>> I have a client application where two requests updating a particular
>> row R1 go to two coordinator nodes at the same time.
>> Row R1 is present on nodes N1 and N2 (since RF is 2).
>> Does Cassandra ensure atomicity & isolation across replicas/partitions for
>> a particular row? If so, how is that handled: does Cassandra use a
>> two-phase-commit transaction for a row, or a distributed lock for a row?
>>
>> Thanks,
>> Rajesh
>>
>>
>>
>


Re: row level atomicity and isolation

2018-05-16 Thread kurt greaves
Atomicity and isolation are only guaranteed within a replica. If you have
multiple concurrent requests across replicas, the last timestamp wins. You
can get better isolation using LWT, which uses Paxos under the hood.
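
As a hedged illustration of the last-write-wins behaviour (the "shops" table,
session variable, and explicit timestamps below are made up for the example):
whichever mutation carries the higher write timestamp survives, regardless of
which coordinator or replica applied it first.

session.execute("UPDATE shops USING TIMESTAMP 1000 SET name = 'A' WHERE shop_id = 1");
session.execute("UPDATE shops USING TIMESTAMP 2000 SET name = 'B' WHERE shop_id = 1");
// A later read returns 'B' even if the TIMESTAMP 1000 write arrives last,
// because reconciliation keeps the cell with the highest timestamp.
Row row = session.execute("SELECT name FROM shops WHERE shop_id = 1").one();
System.out.println(row.getString("name"));   // "B"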

On 16 May 2018 at 08:55, Rajesh Kishore  wrote:

> Hi,
>
> I am just curious to know: when the Cassandra docs say that atomicity and
> isolation are guaranteed for a row, does that mean two requests updating a
> row "R1" at different replicas will be candidates for atomicity and
> isolation?
>
> For instance, I have a setup where RF is 2.
> I have a client application where two requests updating a particular
> row R1 go to two coordinator nodes at the same time.
> Row R1 is present on nodes N1 and N2 (since RF is 2).
> Does Cassandra ensure atomicity & isolation across replicas/partitions for
> a particular row? If so, how is that handled: does Cassandra use a
> two-phase-commit transaction for a row, or a distributed lock for a row?
>
> Thanks,
> Rajesh
>
>
>


Cassandra java driver linux encoding issue

2018-05-16 Thread rami dabbah
Hi,

I am trying to query a text field from Cassandra using the Java driver; see
the code below. On Windows it works fine, but on Linux I am getting ??
instead of Chinese characters.


Code:

ResultSet shopsRS = this.cassandraDAO.getshopsFromScanRawByScanId(
        cassandraSession, "scan_raw", scanid);
String record = null;
for (Row row : shopsRS) {
    try {
        // Log the JVM default charset (differs between the Windows and Linux hosts)
        pProtocol.addEvent(new BaseEvent(BaseEvent.LEVEL_ERROR,
                "Charset.defaultCharset():" + Charset.defaultCharset()));
        record = row.getString("raw_data");
        Helper.verifyEncoding(record);
        String updated_record = Helper.addAttributeToJsonString(pProtocol,
                row.getString("raw_data"), CommonVars.AUX_DATA,
                CommonVars.AUX_DATA_BATCH_ID, batchId);
        Helper.verifyEncoding(updated_record);
        producer.sendMessage(updated_record);
        counter++;
    } catch (IOException e) {
        pProtocol.addEvent(new BaseEvent(BaseEvent.LEVEL_ERROR,
                "Could not send Message: "));
        e.printStackTrace();
    }
}



example text:

"details_product_name":"佛罗伦萨万豪AC酒店(AC Hotel Firenze)|"


-- 
Rami Dabbah

Java Professional.


Re: Suggestions for migrating data from cassandra

2018-05-16 Thread Jing Meng
We would try the migration for some small keyspaces (with several gigabytes
of data across a DC) first, but ultimately the migration of several large
keyspaces, with data sizes ranging from 100G to 5T and some tables holding
>1T of data, would be scheduled too.

As for StreamSets/Talend, I personally doubt that using them would be
appropriate at our company, as manpower is pretty limited for this
migration.

Arbab's answer actually resolved my initial concern; I'm now trying to play
with the spark-connector.

Thanks for all your replies, much appreciated!
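
A minimal sketch of the spark-connector path, assuming the
spark-cassandra-connector and a MySQL JDBC driver are on the classpath; the
keyspace, table, host names, and credentials are placeholders:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class CassandraToMysql {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("cassandra-to-mysql")
                .config("spark.cassandra.connection.host", "cassandra-host")
                .getOrCreate();

        // Read the Cassandra table through the connector's data source.
        Dataset<Row> df = spark.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "my_keyspace")
                .option("table", "my_table")
                .load();

        // Write the rows straight into MySQL over JDBC.
        df.write()
                .mode(SaveMode.Append)
                .format("jdbc")
                .option("url", "jdbc:mysql://mysql-host:3306/mydb")
                .option("dbtable", "my_table")
                .option("user", "user")
                .option("password", "password")
                .save();

        spark.stop();
    }
}

The same DataFrame could instead be written with df.write().csv(...) if the
CSV-then-load route turns out to be easier to throttle.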

2018-05-16 5:35 GMT+08:00 Joseph Arriola :

> Hi Jing.
>
> How much data do you need to migrate, in volume and number of tables?
>
> With Spark you could do the following:
>
>    - Read the data and export it directly to MySQL.
>    - Read the data, export it to CSV files, and then load them into MySQL.
>
>
> You could also use other paths such as:
>
>
>- StreamSets
>- Talend Open Studio
>- Kafka Streams.
>
>
>
>
> 2018-05-15 4:59 GMT-06:00 Jing Meng :
>
>> Hi guys, for some historical reasons our Cassandra cluster is currently
>> overloaded, and operating it has somehow become a nightmare. Anyway,
>> (sadly) we're planning to migrate the Cassandra data back to MySQL...
>>
>> So we're not quite clear on how to migrate the historical data from
>> Cassandra.
>>
>> While I know there is the COPY command, I wonder whether it works in a
>> production environment where hundreds of gigabytes of data are present.
>> And if it does, would it impact server performance significantly?
>>
>> Apart from that, I know the spark-connector can be used to scan data from
>> a C* cluster, but I'm not that familiar with Spark and am still not sure
>> whether writing the data to a MySQL database can be done naturally with
>> the spark-connector.
>>
>> Are there any suggestions/best practices/reading materials for doing this?
>>
>> Thanks!
>>
>
>


row level atomicity and isolation

2018-05-16 Thread Rajesh Kishore
Hi,

I am just curious to know: when the Cassandra docs say that atomicity and
isolation are guaranteed for a row, does that mean two requests updating a
row "R1" at different replicas will be candidates for atomicity and
isolation?

For instance, I have a setup where RF is 2.
I have a client application where two requests updating a particular row
R1 go to two coordinator nodes at the same time.
Row R1 is present on nodes N1 and N2 (since RF is 2).
Does Cassandra ensure atomicity & isolation across replicas/partitions for a
particular row? If so, how is that handled: does Cassandra use a
two-phase-commit transaction for a row, or a distributed lock for a row?

Thanks,
Rajesh