[jira] [Created] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)
Enormous counter 
-

 Key: CASSANDRA-3006
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.3
 Environment: ubuntu 10.04
Reporter: Boris Yen


I have a two-node cluster with the following keyspace and column family settings.

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]

Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [datacenter1:2]
  Column Families:
ColumnFamily: testCounter (Super)
"APP status information."
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: 
org.apache.cassandra.db.marshal.CounterColumnType
  Columns sorted by: 
org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: true
  Built indexes: []

Then, I use a test program based on Hector to add a counter column 
(testCounter[sc][column]) 1000 times. In the middle of the adding process, I 
intentionally shut down the node 172.17.19.152. In addition, the test program 
is smart enough to switch the consistency level from Quorum to One, so that 
the subsequent add operations do not fail. 

After all the add operations are done, I start Cassandra on 172.17.19.152 and 
use cassandra-cli to check whether the counter is correct on both nodes. I get 
a result of 1001, which seems reasonable because Hector retries once. However, 
I then shut down 172.17.19.151 and, after 172.17.19.152 becomes aware that 
172.17.19.151 is down, start Cassandra on 172.17.19.151 again. When I check 
the counter this time, I get a result of 481387, which is far off.

I used 0.8.3 to reproduce this bug, but I think it also happens on 0.8.2 and 
earlier. 
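For what it's worth, the 1001 is consistent with counter adds not being idempotent: a client retry after a timeout re-applies an increment that may already have been accepted. A self-contained Java sketch of that effect (a toy model, not Hector or Cassandra code):

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model: a replica that applies every increment it receives, and a
// client that retries on timeout. Counter adds are not idempotent, so a
// retry of an increment that actually reached the replica counts twice.
public class CounterRetryDemo {
    static class Replica {
        final AtomicLong counter = new AtomicLong();

        // Applies the increment; 'timeout' simulates the ack being lost.
        boolean add(long delta, boolean timeout) {
            counter.addAndGet(delta);   // the write is applied regardless
            return !timeout;            // false = client saw a timeout
        }
    }

    public static long runClient(Replica replica, int adds, int timeouts) {
        int remainingTimeouts = timeouts;
        for (int i = 0; i < adds; i++) {
            boolean ack = replica.add(1, remainingTimeouts > 0);
            if (!ack) {                 // client retries blindly
                remainingTimeouts--;
                replica.add(1, false);  // duplicate increment lands
            }
        }
        return replica.counter.get();
    }

    public static void main(String[] args) {
        // 1000 adds with one timeout+retry ends at 1001, matching the
        // "reasonable" value seen before the second restart.
        System.out.println(runClient(new Replica(), 1000, 1));
    }
}
```

The much larger 481387 cannot come from a single retry; the replica-restart sequence is presumably multiplying increments server-side. The sketch only shows why retried adds are unsafe to begin with.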

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081545#comment-13081545
 ] 

Boris Yen commented on CASSANDRA-3006:
--

I forgot to mention that the counter is also out of sync between the two 
nodes: one shows 481387 and the other shows 20706.

> Enormous counter 
> -
>
> Key: CASSANDRA-3006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.3
> Environment: ubuntu 10.04
>Reporter: Boris Yen
>
> I have a two-node cluster with the following keyspace and column family 
> settings.
> Cluster Information:
>Snitch: org.apache.cassandra.locator.SimpleSnitch
>Partitioner: org.apache.cassandra.dht.RandomPartitioner
>Schema versions: 
>   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
> Keyspace: test:
>   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>   Durable Writes: true
> Options: [datacenter1:2]
>   Column Families:
> ColumnFamily: testCounter (Super)
> "APP status information."
>   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>   Default column value validator: 
> org.apache.cassandra.db.marshal.CounterColumnType
>   Columns sorted by: 
> org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
>   Row cache size / save period in seconds: 0.0/0
>   Key cache size / save period in seconds: 20.0/14400
>   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
>   GC grace seconds: 864000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: true
>   Built indexes: []
> Then, I use a test program based on Hector to add a counter column 
> (testCounter[sc][column]) 1000 times. In the middle of the adding process, I 
> intentionally shut down the node 172.17.19.152. In addition, the test program 
> is smart enough to switch the consistency level from Quorum to One, so that 
> the subsequent add operations do not fail. 
> After all the add operations are done, I start Cassandra on 172.17.19.152 and 
> use cassandra-cli to check whether the counter is correct on both nodes. I 
> get a result of 1001, which seems reasonable because Hector retries once. 
> However, I then shut down 172.17.19.151 and, after 172.17.19.152 becomes 
> aware that 172.17.19.151 is down, start Cassandra on 172.17.19.151 again. 
> When I check the counter this time, I get a result of 481387, which is far 
> off.
> I used 0.8.3 to reproduce this bug, but I think it also happens on 0.8.2 and 
> earlier. 





[jira] [Updated] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boris Yen updated CASSANDRA-3006:
-

Description: 
I have a two-node cluster with the following keyspace and column family settings.

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]

Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [datacenter1:2]
  Column Families:
ColumnFamily: testCounter (Super)
"APP status information."
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: 
org.apache.cassandra.db.marshal.CounterColumnType
  Columns sorted by: 
org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: true
  Built indexes: []

Then, I use a test program based on Hector to add a counter column 
(testCounter[sc][column]) 1000 times. In the middle of the adding process, I 
intentionally shut down the node 172.17.19.152. In addition, the test program 
is smart enough to switch the consistency level from Quorum to One, so that 
the subsequent add operations do not fail. 

After all the add operations are done, I start Cassandra on 172.17.19.152 and 
use cassandra-cli to check whether the counter is correct on both nodes. I get 
a result of 1001, which seems reasonable because Hector retries once. However, 
I then shut down 172.17.19.151 and, after 172.17.19.152 becomes aware that 
172.17.19.151 is down, start Cassandra on 172.17.19.151 again. When I check 
the counter this time, I get a result of 481387, which is far off.

I used 0.8.3 to reproduce this bug, but I think it also happens on 0.8.2 and 
earlier. 

  was:
I have two-node cluster with the following keyspace and column family settings.

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]

Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [datacenter1:2]
  Column Families:
ColumnFamily: testCounter (Super)
"APP status information."
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: 
org.apache.cassandra.db.marshal.CounterColumnType
  Columns sorted by: 
org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: true
  Built indexes: []

Then, I use a test program based on hector to add a counter column 
(testCounter[sc][column]) 1000 times. In the middle the adding process, I 
intentional shut down the node 172.17.19.152. In addition to that, the test 
program is smart enough to switch the consistency level from Quorum to One, so 
that the following adding actions would not fail. 

After all the adding actions are done, I start the cassandra on 172.17.19.152, 
and I use cassandra-cli to check if the counter is correct on both nodes, and I 
got a result 1001 which should be reasonable because hector will retry once. 
However, when I shut down 172.17.19.151 and after 172.17.19.152 is aware of 
172.17.19.151 is down, I try to start the cassandra on 172.17.19.151 again. 
Then, I check the counter again, this time I got a result 481387 which is so 
wrong.

I use 0.8.3 the reproduce this bug, but I think this also happens on 0.8.2 or 
before also. 


> Enormous counter 
> -
>
> Key: CASSANDRA-3006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.3
> Environment: ubuntu 10.04
>Reporter: Boris Yen
>
> I have two-node cluster with the following keyspace and column family 
> settings.
> Cluster Information:
>Snitch: org.apache.cassandra.locator.SimpleSnitch
>Partitioner: org.apache.cassandra.dht.RandomPartitioner
>Schema versions: 
>   63fda700-c243-11e0--2d

[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2843:


Attachment: 2843_h.patch

bq. the IColumnMap name when it does not implement Map interface, and some 
things it has in common with Map (iteration) it changes semantics of (iterating 
values instead of keys). not sure what to use instead though, since we already 
have an IColumnContainer. Maybe ISortedColumns?

Yeah, I'm not sure I have a better name either; maybe ISortedColumnHolder, but 
I'm not sure that's better than ISortedColumns, so the attached rebased patch 
simply renames ColumnMap -> SortedColumns.

bq. TSCM and ALCM extending instead of wrapping CSLM/AL, respectively

The idea was to save one object creation. I admit this is probably not a huge 
deal, but it felt like extending instead of wrapping was no big deal in this 
case either, so it seemed worth "optimizing". I still stand by that choice, but 
I have no good argument against the criticism that it is possibly premature.

bq. unrelated reformatting

If we're talking about the ones in SuperColumn.java, sorry, I mistakenly forced 
re-indentation on the file, which rewrote the tabs to spaces. The new patch 
keeps the old formatting. I'd also mention that there are a few places where 
I've rewritten cf.getSortedColumns().iterator() as cf.iterator(), which is 
arguably a bit gratuitous for this patch, but I figured it avoids creating a 
new Collection in the case of CSLM and there aren't many occurrences.


> better performance on long row read
> ---
>
> Key: CASSANDRA-2843
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Yang Yang
> Fix For: 1.0
>
> Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, 
> fix.diff, microBenchmark.patch, patch_timing, std_timing
>
>
> currently if a row contains > 1000 columns, the run time becomes considerably 
> slow: my test of a row with 3000 columns (standard, regular), each with 8 
> bytes in name and 40 bytes in value, takes about 16ms.
> this is all running in memory, no disk read is involved.
> through debugging we can find
> most of this time is spent on 
> [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
> ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
> ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
> int, ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
>  Iterator, int)
> [Wall Time]  
> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
>  Iterator, int)
> [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
> ColumnFamily.addColumn() is slow because it inserts into an internal 
> concurrentSkipListMap() that maps column names to values.
> this structure is slow for two reasons: it needs to do synchronization; it 
> needs to maintain a more complex structure of map.
> but if we look at the whole read path, thrift already defines the read output 
> to be List so it does not make sense to use a luxury map 
> data structure in the interim and finally convert it to a list. on the 
> synchronization side, since the returned CF is never going to be 
> shared/modified by other threads, we know the access is always 
> single-threaded, so no synchronization is needed.
> but these 2 features are indeed needed for ColumnFamily in other cases, 
> particularly write. so we can provide a different ColumnFamily to 
> CFS.getTopLevelColumnFamily(), so that getTopLevelColumnFamily no longer 
> always creates the standard ColumnFamily but takes a provided returnCF, 
> whose cost is much cheaper.
> the provided patch is for demonstration now; will work further once we agree 
> on the general direction. 
> CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is 
> provided. the main work is to let the FastColumnFamily use an array for 
> internal storage. at first I used binary search to insert new columns in 
> addColumn(), but later I found that even this is not necessary, since all 
> calling scenarios of ColumnFamily.addColumn() have an invariant that the 
> inserted columns come in sorted order (I still have an issue to resolve, 
> descending or ascending, but ascending works). so the current logic is 
> simply to compare the new column against the last column in the array: if 
> the names are not equal, append; if equal, reconcile.
> slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
> flavors of the method, one accepting a returnCF. but we could definitely
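The append-or-reconcile idea described above can be sketched independently of the actual patch (class and method names below are illustrative, not the FastColumnFamily API):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the ticket's idea: when addColumn() is only
// ever called with names in ascending order, a plain array list can
// replace the ConcurrentSkipListMap. The new column is compared only
// against the last element: equal names reconcile (keep the higher
// timestamp), a greater name appends, a smaller name breaks the invariant.
public class SortedColumnsSketch {
    public static final class Column {
        public final String name;
        public final long timestamp;

        public Column(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }
    }

    private final List<Column> columns = new ArrayList<>();

    public void addColumn(Column c) {
        if (!columns.isEmpty()) {
            Column last = columns.get(columns.size() - 1);
            int cmp = last.name.compareTo(c.name);
            if (cmp == 0) {             // same name: reconcile
                if (c.timestamp > last.timestamp)
                    columns.set(columns.size() - 1, c);
                return;
            }
            if (cmp > 0)                // caller violated sorted order
                throw new IllegalStateException("columns must arrive sorted");
        }
        columns.add(c);                 // new name: O(1) append at the end
    }

    public int size() { return columns.size(); }
    public Column get(int i) { return columns.get(i); }
}
```

This is why the patch can skip both synchronization and binary search: the returnCF is thread-confined, and sorted arrival reduces insertion to a single comparison against the tail.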

[jira] [Created] (CASSANDRA-3007) NullPointerException in MessagingService.java:420

2011-08-09 Thread Viliam Holub (JIRA)
NullPointerException in MessagingService.java:420
-

 Key: CASSANDRA-3007
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.3
 Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 
05:15:26 UTC 2010 x86_64 GNU/Linux
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Reporter: Viliam Holub
Priority: Minor


I'm getting a large quantity of exceptions during streaming, always at 
MessagingService.java:420. The streaming appears to be blocked.

 INFO 10:11:14,734 Streaming to /10.235.77.27
ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
java.lang.NullPointerException
at 
org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
at 
org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
at 
org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
at 
org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)






[jira] [Updated] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-1717:
---

Attachment: CASSANDRA-1717-v2.patch

bq. CSW.flushData() forgot to reset the checksum (this is caught by the unit 
tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Let's leave that to the ticket for CRC optimization, which will allow us to 
modify that system-wide.
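The int-vs-long point is easy to check with java.util.zip.CRC32 directly: getValue() always returns a value that fits in 32 bits, so narrowing to int for storage and widening back with a mask is lossless (a generic illustration, not the patch's code):

```java
import java.util.zip.CRC32;

// CRC32 is a 32-bit checksum; getValue() returns long only because the
// Checksum interface requires it. Narrowing to int and widening back
// with an unsigned mask round-trips exactly.
public class CrcWidthDemo {
    public static boolean roundTrips(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        long full = crc.getValue();             // always in [0, 2^32)
        int narrowed = (int) full;              // what would be written to disk
        long restored = narrowed & 0xFFFFFFFFL; // what would be read back
        return restored == full;
    }
}
```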

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data gives a bit more 
safety; I don't know what our exact gain is quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have others' opinions more than anything else.

  The checksum is now computed over the original (uncompressed) data.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

bq. As Todd noted, HADOOP-6148 contains a bunch of discussions on the 
efficiency of java CRC32. In particular, it seems they have been able to close 
to double the speed of the CRC32, with a solution that seems fairly simple to 
me. It would be ok to use java native CRC32 and leave the improvement to 
another ticket, but quite frankly if it is that simple and since the hadoop 
guys have done all the hard work for us, I say we start with the efficient 
version directly.

  As decided previously, this will be a matter of a separate ticket.

Rebased with latest trunk (last commit 1e36fb1e44bff96005dd75a25648ff25eea6a95f)

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Issue Comment Edited] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081569#comment-13081569
 ] 

Pavel Yaskevich edited comment on CASSANDRA-1717 at 8/9/11 11:25 AM:
-

bq. CSW.flushData() forgot to reset the checksum (this is caught by the unit 
tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Let's leave that to the ticket for CRC optimization, which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  Checksum is moved to the original data.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

bq. As Todd noted, HADOOP-6148 contains a bunch of discussions on the 
efficiency of java CRC32. In particular, it seems they have been able to close 
to double the speed of the CRC32, with a solution that seems fairly simple to 
me. It would be ok to use java native CRC32 and leave the improvement to 
another ticket, but quite frankly if it is that simple and since the hadoop 
guys have done all the hard work for us, I say we start with the efficient 
version directly.

  As decided previously, this will be a matter of a separate ticket.

Rebased with latest trunk (last commit 1e36fb1e44bff96005dd75a25648ff25eea6a95f)

  was (Author: xedin):
bq. CSW.flushData() forgot to reset the checksum (this is caught by the 
unit tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Lets leave that to the ticket for CRC optimization which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  Checksum is moved to the original data.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

As Todd noted, HADOOP-6148 contains a bunch of discussions on the efficiency of 
java CRC32. In particular, it seems they have been able to close to double the 
speed of the CRC32, with a solution that seems fairly simple to me. It would be 
ok to use java native CRC32 and leave the improvement to another ticket, but 
quite 

[jira] [Issue Comment Edited] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081569#comment-13081569
 ] 

Pavel Yaskevich edited comment on CASSANDRA-1717 at 8/9/11 11:29 AM:
-

bq. CSW.flushData() forgot to reset the checksum (this is caught by the unit 
tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Let's leave that to the ticket for CRC optimization, which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  It checksums the original (uncompressed) data and stores the checksum at the 
end of the compressed chunk; the reader verifies the checksum after 
decompression.
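For illustration, that layout can be sketched with plain java.util.zip (a schematic of the idea, not the CompressedSequentialWriter/CompressedRandomAccessReader code; the chunk format here is invented for the demo):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Schematic of "checksum the uncompressed data, store it after the
// compressed chunk": the writer compresses, then appends the CRC32 of
// the ORIGINAL bytes; the reader decompresses first and only then
// verifies, so bitrot surfaces at read time as either a decompression
// error or a checksum mismatch.
public class ChunkChecksumSketch {

    // Chunk layout (demo only): [deflate-compressed bytes][8-byte CRC32].
    public static byte[] writeChunk(byte[] original) {
        Deflater deflater = new Deflater();
        deflater.setInput(original);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        CRC32 crc = new CRC32();
        crc.update(original);                       // checksum the UNCOMPRESSED data
        byte[] trailer = ByteBuffer.allocate(8).putLong(crc.getValue()).array();
        out.write(trailer, 0, trailer.length);
        return out.toByteArray();
    }

    public static byte[] readChunk(byte[] chunk, int originalLength) throws Exception {
        int split = chunk.length - 8;               // trailer starts here
        Inflater inflater = new Inflater();
        inflater.setInput(chunk, 0, split);
        byte[] original = new byte[originalLength];
        int read = 0;
        while (read < originalLength && !inflater.finished())
            read += inflater.inflate(original, read, originalLength - read);
        CRC32 crc = new CRC32();
        crc.update(original);                       // verify AFTER decompression
        long stored = ByteBuffer.wrap(chunk, split, 8).getLong();
        if (crc.getValue() != stored)
            throw new IOException("chunk failed checksum");
        return original;
    }
}
```

Flipping any bit of a stored chunk then shows up at read time as a decompression error or a checksum mismatch, which is the property the bit-flip unit test mentioned earlier is checking.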
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

bq. As Todd noted, HADOOP-6148 contains a bunch of discussions on the 
efficiency of java CRC32. In particular, it seems they have been able to close 
to double the speed of the CRC32, with a solution that seems fairly simple to 
me. It would be ok to use java native CRC32 and leave the improvement to 
another ticket, but quite frankly if it is that simple and since the hadoop 
guys have done all the hard work for us, I say we start with the efficient 
version directly.

  As decided previously, this will be a matter of a separate ticket.

Rebased with latest trunk (last commit 1e36fb1e44bff96005dd75a25648ff25eea6a95f)

  was (Author: xedin):
bq. CSW.flushData() forgot to reset the checksum (this is caught by the 
unit tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Lets leave that to the ticket for CRC optimization which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  Checksum is moved to the original data.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

bq. As Todd noted, HADOOP-6148 contains a bunch of discussions on the 
efficiency of java CRC32. In particular, it seems they have been able to close 
to double the speed of the CRC32, with a solution that seems fa

[jira] [Commented] (CASSANDRA-3007) NullPointerException in MessagingService.java:420

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081601#comment-13081601
 ] 

Jonathan Ellis commented on CASSANDRA-3007:
---

What kind of streaming are you attempting?  

> NullPointerException in MessagingService.java:420
> -
>
> Key: CASSANDRA-3007
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.8.3
> Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 
> 05:15:26 UTC 2010 x86_64 GNU/Linux
> java version "1.6.0_18"
> OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
> OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
>Reporter: Viliam Holub
>Priority: Minor
>  Labels: nullpointerexception, streaming
>
> I'm getting a large quantity of exceptions during streaming, always at 
> MessagingService.java:420. The streaming appears to be blocked.
>  INFO 10:11:14,734 Streaming to /10.235.77.27
> ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
> java.lang.NullPointerException
> at 
> org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
> at 
> org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
> at 
> org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
> at 
> org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:636)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3007) NullPointerException in MessagingService.java:420

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3007:
--

Attachment: 3007.txt

Never mind, not relevant.  Looks like you upgraded from 0.7 without updating 
your configuration file?

Fix for missing encryption_options attached.

> NullPointerException in MessagingService.java:420
> -
>
> Key: CASSANDRA-3007
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.8.3
> Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 
> 05:15:26 UTC 2010 x86_64 GNU/Linux
> java version "1.6.0_18"
> OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
> OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
>Reporter: Viliam Holub
>Priority: Minor
>  Labels: nullpointerexception, streaming
> Fix For: 0.8.4
>
> Attachments: 3007.txt
>
>
> I'm getting large quantity of exceptions during streaming. It is always in 
> MessagingService.java:420. The streaming appears to be blocked.
>  INFO 10:11:14,734 Streaming to /10.235.77.27
> ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
> java.lang.NullPointerException
> at 
> org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
> at 
> org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
> at 
> org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
> at 
> org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:636)





[jira] [Assigned] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-3006:
-

Assignee: Sylvain Lebresne

> Enormous counter 
> -
>
> Key: CASSANDRA-3006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.3
> Environment: ubuntu 10.04
>Reporter: Boris Yen
>Assignee: Sylvain Lebresne
>
> I have two-node cluster with the following keyspace and column family 
> settings.
> Cluster Information:
>Snitch: org.apache.cassandra.locator.SimpleSnitch
>Partitioner: org.apache.cassandra.dht.RandomPartitioner
>Schema versions: 
>   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
> Keyspace: test:
>   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>   Durable Writes: true
> Options: [datacenter1:2]
>   Column Families:
> ColumnFamily: testCounter (Super)
> "APP status information."
>   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>   Default column value validator: 
> org.apache.cassandra.db.marshal.CounterColumnType
>   Columns sorted by: 
> org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
>   Row cache size / save period in seconds: 0.0/0
>   Key cache size / save period in seconds: 20.0/14400
>   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
>   GC grace seconds: 864000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: true
>   Built indexes: []
> Then, I use a test program based on hector to add a counter column 
> (testCounter[sc][column]) 1000 times. In the middle the adding process, I 
> intentional shut down the node 172.17.19.152. In addition to that, the test 
> program is smart enough to switch the consistency level from Quorum to One, 
> so that the following adding actions would not fail. 
> After all the adding actions are done, I start the cassandra on 
> 172.17.19.152, and I use cassandra-cli to check if the counter is correct on 
> both nodes, and I got a result 1001 which should be reasonable because hector 
> will retry once. However, when I shut down 172.17.19.151 and after 
> 172.17.19.152 is aware of 172.17.19.151 is down, I try to start the cassandra 
> on 172.17.19.151 again. Then, I check the counter again, this time I got a 
> result 481387 which is so wrong.
> I use 0.8.3 to reproduce this bug, but I think this also happens on 0.8.2 or 
> before also. 





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081603#comment-13081603
 ] 

Sylvain Lebresne commented on CASSANDRA-1717:
-

{quote}
bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

Let's leave that to the ticket for CRC optimization, which will allow us to 
modify that system-wide
{quote}
Let's not:
* this is completely orthogonal to switching to a drop-in, faster, CRC 
implementation.
* it is unclear whether we want to make that system-wide. Imho, it is not worth 
breaking commit log compatibility for that, but it is stupid to commit new code 
that perpetuates the mistake, only to change it later.
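The point that CRC32 is a 32-bit value internally can be demonstrated with a small sketch. The helper names here are hypothetical; getValue() returns a long only because the Checksum interface requires it, so casting to int and zero-extending back is lossless:

```java
import java.util.zip.CRC32;

public class CrcIntDemo {
    // Compute CRC32 and keep only the 4 bytes that actually carry information.
    public static int crcAsInt(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return (int) crc.getValue(); // the low 32 bits hold the entire checksum
    }

    // Recover the canonical long form, e.g. to compare against getValue().
    public static long intToCrcLong(int stored) {
        return stored & 0xFFFFFFFFL; // zero-extend back to the unsigned value
    }

    public static void main(String[] args) {
        byte[] data = "any block of data".getBytes();
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        // Round trip through the 4-byte form loses nothing.
        assert intToCrcLong(crcAsInt(data)) == crc.getValue();
    }
}
```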

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081605#comment-13081605
 ] 

Jonathan Ellis commented on CASSANDRA-1717:
---

Saving 4 bytes out of 64K doesn't seem like enough benefit to make life harder 
for ourselves if we want to use a long checksum later.

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081609#comment-13081609
 ] 

Pavel Yaskevich commented on CASSANDRA-1717:


+1 with Jonathan; it is also better if we satisfy the interface instead of 
relying on internal implementation details, which could also be helpful if we 
decide to change the checksum algorithm.

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081629#comment-13081629
 ] 

Sylvain Lebresne commented on CASSANDRA-1717:
-

What are the chances we'll switch from CRC32 any time soon? And even if we do, 
why would that justify writing 4 bytes of 0's right now? We will still have to 
bump the file format version and keep the code compatible with the old CRC32 
format if we do so. It's not like the only difference between checksum 
algorithms is the size of the checksum.

So yes, 4 bytes out of 64K is not a lot of data, but knowingly writing 4 bytes 
of 0's every 64K, every time, for the vague remote chance that it may save us 1 
or 2 lines of code someday (again, that even remains to be proven) feels 
ridiculous to me. But if I'm the only one who feels that way, fine, it's not a 
big deal.

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081637#comment-13081637
 ] 

Pavel Yaskevich commented on CASSANDRA-1717:


I still think that such a change is a matter for a separate ticket, as we will 
want to change CRC handling globally: we can make our own Checksum class which 
will return an int value, apply the performance improvements mentioned in 
HADOOP-6148 to it, and use it system-wide.

Is there anything else that keeps this from being committed?
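The "own Checksum class" proposal above could be sketched as follows. The class name PureJavaChecksum and the int accessor are hypothetical; a real version would swap the CRC32 delegate for a faster HADOOP-6148-style implementation:

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// A system-wide checksum wrapper: delegates to CRC32 today, but gives one
// place to swap in a faster implementation later without touching callers.
public final class PureJavaChecksum implements Checksum {
    private final CRC32 delegate = new CRC32();

    public void update(int b)                        { delegate.update(b); }
    public void update(byte[] b, int off, int len)   { delegate.update(b, off, len); }
    public long getValue()                           { return delegate.getValue(); }
    public void reset()                              { delegate.reset(); }

    // Convenience accessor returning the 32-bit value directly,
    // so callers can serialize 4 bytes instead of 8.
    public int getIntValue() { return (int) delegate.getValue(); }
}
```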

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





svn commit: r1155374 - /cassandra/branches/cassandra-0.8/debian/control

2011-08-09 Thread eevans
Author: eevans
Date: Tue Aug  9 14:05:55 2011
New Revision: 1155374

URL: http://svn.apache.org/viewvc?rev=1155374&view=rev
Log:
build requires subversion (line 235 of build.xml)

Patch by Sven Wilhelm; reviewed by eevans

Modified:
cassandra/branches/cassandra-0.8/debian/control

Modified: cassandra/branches/cassandra-0.8/debian/control
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/debian/control?rev=1155374&r1=1155373&r2=1155374&view=diff
==
--- cassandra/branches/cassandra-0.8/debian/control (original)
+++ cassandra/branches/cassandra-0.8/debian/control Tue Aug  9 14:05:55 2011
@@ -2,7 +2,7 @@ Source: cassandra
 Section: misc
 Priority: extra
 Maintainer: Eric Evans 
-Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 
1.7), ant-optional (>= 1.7)
+Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 
1.7), ant-optional (>= 1.7), subversion
 Homepage: http://cassandra.apache.org
 Vcs-Svn: https://svn.apache.org/repos/asf/cassandra/trunk
 Vcs-Browser: http://svn.apache.org/viewvc/cassandra/trunk




[jira] [Created] (CASSANDRA-3008) Error getting range slices

2011-08-09 Thread Luis Eduardo Villares Matta (JIRA)
Error getting range slices
--

 Key: CASSANDRA-3008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.2
 Environment: Ubuntu, using the 08x repository
Reporter: Luis Eduardo Villares Matta
Priority: Critical


I can't get a range slice on one of my column families.

ERROR 14:16:26,672 Internal error processing get_range_slices
java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191
at 
org.apache.cassandra.db.columniterator.SimpleSliceReader.(SimpleSliceReader.java:66)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:86)
at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87)
at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184)
at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
at 
org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
at 
org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
at 
org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
at 
org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
at 
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392)
at 
org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684)
at 
org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
at 
org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
at 
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191
at 
org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
at 
org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50)
at 
org.apache.cassandra.db.columniterator.SimpleSliceReader.(SimpleSliceReader.java:57)
... 24 more





[jira] [Commented] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081660#comment-13081660
 ] 

Sylvain Lebresne commented on CASSANDRA-3006:
-

I haven't had any luck reproducing this so far. I've tried to stick with the 
description above but did not use hector (not saying it is hector's fault, 
though; maybe it is the way it does retries that I don't emulate well). If you 
are able to share a minimal hector script with which you reproduce this easily, 
that would be very helpful.

> Enormous counter 
> -
>
> Key: CASSANDRA-3006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.3
> Environment: ubuntu 10.04
>Reporter: Boris Yen
>Assignee: Sylvain Lebresne
>
> I have two-node cluster with the following keyspace and column family 
> settings.
> Cluster Information:
>Snitch: org.apache.cassandra.locator.SimpleSnitch
>Partitioner: org.apache.cassandra.dht.RandomPartitioner
>Schema versions: 
>   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
> Keyspace: test:
>   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>   Durable Writes: true
> Options: [datacenter1:2]
>   Column Families:
> ColumnFamily: testCounter (Super)
> "APP status information."
>   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>   Default column value validator: 
> org.apache.cassandra.db.marshal.CounterColumnType
>   Columns sorted by: 
> org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
>   Row cache size / save period in seconds: 0.0/0
>   Key cache size / save period in seconds: 20.0/14400
>   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
>   GC grace seconds: 864000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: true
>   Built indexes: []
> Then, I use a test program based on hector to add a counter column 
> (testCounter[sc][column]) 1000 times. In the middle the adding process, I 
> intentional shut down the node 172.17.19.152. In addition to that, the test 
> program is smart enough to switch the consistency level from Quorum to One, 
> so that the following adding actions would not fail. 
> After all the adding actions are done, I start the cassandra on 
> 172.17.19.152, and I use cassandra-cli to check if the counter is correct on 
> both nodes, and I got a result 1001 which should be reasonable because hector 
> will retry once. However, when I shut down 172.17.19.151 and after 
> 172.17.19.152 is aware of 172.17.19.151 is down, I try to start the cassandra 
> on 172.17.19.151 again. Then, I check the counter again, this time I got a 
> result 481387 which is so wrong.
> I use 0.8.3 to reproduce this bug, but I think this also happens on 0.8.2 or 
> before also. 





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-08-09 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081665#comment-13081665
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

I don't (yet) know how to add hint types to hive, but once a transposed hint 
operator is added we should be able to hook it into the hive driver.

> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API, Core
>Reporter: Eric Evans
>  Labels: cql
> Fix For: 1.0
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimted terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081669#comment-13081669
 ] 

Jonathan Ellis commented on CASSANDRA-2474:
---

Isn't changing query semantics kind of the opposite of what hints are supposed 
to be for?

> CQL support for compound columns
> 
>
> Key: CASSANDRA-2474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API, Core
>Reporter: Eric Evans
>  Labels: cql
> Fix For: 1.0
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimted terms), and then 
> teaching the decoders (drivers) to create structures from the results.





[jira] [Commented] (CASSANDRA-3008) Error getting range slices

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081673#comment-13081673
 ] 

Jonathan Ellis commented on CASSANDRA-3008:
---

did you try "nodetool scrub"?

> Error getting range slices
> --
>
> Key: CASSANDRA-3008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.2
> Environment: Ubuntu, using the 08x repository
>Reporter: Luis Eduardo Villares Matta
>Priority: Critical
>
> I can't get a range slice on one of my column families.
> ERROR 14:16:26,672 Internal error processing get_range_slices
> java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191
> at 
> org.apache.cassandra.db.columniterator.SimpleSliceReader.(SimpleSliceReader.java:66)
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91)
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:86)
> at 
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71)
> at 
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
> at 
> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
> at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392)
> at 
> org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684)
> at 
> org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
> at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191
> at 
> org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
> at 
> org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50)
> at 
> org.apache.cassandra.db.columniterator.SimpleSliceReader.(SimpleSliceReader.java:57)
> ... 24 more





[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-08-09 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081679#comment-13081679
 ] 

Chris Burroughs commented on CASSANDRA-2749:


It would also be cool (but this is obviously speculative) to have the ability 
to keep Index files on an SSD, and the larger data files on rotating disks.

> fine-grained control over data directories
> --
>
> Key: CASSANDRA-2749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.0
>
>
> Currently Cassandra supports multiple data directories but no way to control 
> what sstables are placed where. Particularly for systems with mixed SSDs and 
> rotational disks, it would be nice to pin frequently accessed columnfamilies 
> to the SSDs.
> Postgresql does this with tablespaces 
> (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
> should probably avoid using that name because of confusing similarity to 
> "keyspaces."





[jira] [Commented] (CASSANDRA-3007) NullPointerException in MessagingService.java:420

2011-08-09 Thread Viliam Holub (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081680#comment-13081680
 ] 

Viliam Holub commented on CASSANDRA-3007:
-

It's the removetoken command.

Yes, I updated the node and forgot to specify encryption_options - thanks!

> NullPointerException in MessagingService.java:420
> -
>
> Key: CASSANDRA-3007
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.8.3
> Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 
> 05:15:26 UTC 2010 x86_64 GNU/Linux
> java version "1.6.0_18"
> OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
> OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
>Reporter: Viliam Holub
>Assignee: Jonathan Ellis
>Priority: Minor
>  Labels: nullpointerexception, streaming
> Fix For: 0.8.4
>
> Attachments: 3007.txt
>
>
> I'm getting large quantity of exceptions during streaming. It is always in 
> MessagingService.java:420. The streaming appears to be blocked.
>  INFO 10:11:14,734 Streaming to /10.235.77.27
> ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
> java.lang.NullPointerException
> at 
> org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
> at 
> org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
> at 
> org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
> at 
> org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:636)





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081685#comment-13081685
 ] 

Sylvain Lebresne commented on CASSANDRA-1717:
-

As previously said, I disagree both with using 8 bytes when we need 4 and with 
the idea that using 4 is a matter for another ticket, but since this is probably 
me being too anal as usual, +1 on the rest of the patch, modulo a small optional 
nitpick: the toLong() function is a bit hard to read imho. It's hard to see 
where the parentheses are and whether it does the right thing. It seems ok 
though; I just think a simple for loop over the bytes would be more readable. We 
also historically keep ByteBufferUtil for ByteBuffer manipulations and use 
FBUtilities for byte[] manipulation.
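The for-loop version suggested here might look like this. It is a sketch only; the class name is hypothetical and the actual signature in FBUtilities may differ:

```java
public final class ByteOps {
    // Readable big-endian byte[8] -> long: fold in one byte at a time
    // instead of a single expression with eight shifts and masks.
    public static long toLong(byte[] bytes) {
        assert bytes.length == 8 : "expected exactly 8 bytes";
        long value = 0;
        for (byte b : bytes)
            value = (value << 8) | (b & 0xFFL); // mask avoids sign extension
        return value;
    }
}
```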


> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081689#comment-13081689
 ] 

Pavel Yaskevich commented on CASSANDRA-1717:


Ok, I will move toLong(byte[] bytes) to FBUtilities and commit, thanks!

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081690#comment-13081690
 ] 

Jonathan Ellis commented on CASSANDRA-1717:
---

You're right, if we change the checksum implementation we need to bump the 
sstable revision anyway.  +1 on casting to int here.  (But as you said above, 
-1 on changing this in CommitLog.)
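As background on why casting to int is safe here: java.util.zip.CRC32 reports 
its checksum through getValue(), which returns a long whose upper 32 bits are 
always zero, so narrowing to int loses nothing. A minimal sketch (illustrative 
only, not the patch code):

```java
import java.util.zip.CRC32;

// Illustrative only: CRC32.getValue() returns a 32-bit checksum widened into
// a long, so (int) narrowing is lossless; masking with 0xFFFFFFFFL on the
// way back recovers the original unsigned value.
public final class ChecksumWidth {
    static int checksumAsInt(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return (int) crc.getValue();  // only the low 32 bits are ever set
    }

    public static void main(String[] args) {
        byte[] data = "some column data".getBytes();
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        long asLong = crc.getValue();
        int asInt = checksumAsInt(data);
        // Round-trip check: int form widens back to the original long value.
        if ((asInt & 0xFFFFFFFFL) != asLong) throw new AssertionError();
    }
}
```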

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
> checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Commented] (CASSANDRA-3008) Error getting range slices

2011-08-09 Thread Luis Eduardo Villares Matta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081701#comment-13081701
 ] 

Luis Eduardo Villares Matta commented on CASSANDRA-3008:


No, I did not; it seems to have fixed my issues.
Thank you very much.
(I am inclined to close this issue, but I do not know if I should. I am also 
testing everything over the next few hours.)

> Error getting range slices
> --
>
> Key: CASSANDRA-3008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.2
> Environment: Ubuntu, using the 08x repository
>Reporter: Luis Eduardo Villares Matta
>Priority: Critical
>
> I cannot get a range slice on one of my column families.
> ERROR 14:16:26,672 Internal error processing get_range_slices
> java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191
> at 
> org.apache.cassandra.db.columniterator.SimpleSliceReader.(SimpleSliceReader.java:66)
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91)
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:86)
> at 
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71)
> at 
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
> at 
> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
> at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392)
> at 
> org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684)
> at 
> org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
> at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191
> at 
> org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
> at 
> org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50)
> at 
> org.apache.cassandra.db.columniterator.SimpleSliceReader.(SimpleSliceReader.java:57)
> ... 24 more





[jira] [Updated] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-1717:
---

Attachment: CASSANDRA-1717-v3.patch

v3 removes BBU.toLong, adds FBU.byteArrayToInt, and uses int instead of long 
for the checksum.

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717-v3.patch, 
> CASSANDRA-1717.patch, checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Commented] (CASSANDRA-3008) Error getting range slices

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081712#comment-13081712
 ] 

Jonathan Ellis commented on CASSANDRA-3008:
---

Check (scrub) your other nodes -- data corruption can happen (usually from bad 
memory), but if there's a pattern of all the nodes being affected at the same 
time, there could be a Cassandra bug.

> Error getting range slices
> --
>
> Key: CASSANDRA-3008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.2
> Environment: Ubuntu, using the 08x repository
>Reporter: Luis Eduardo Villares Matta
>Priority: Critical
>
> I cannot get a range slice on one of my column families.
> ERROR 14:16:26,672 Internal error processing get_range_slices
> java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191
> at 
> org.apache.cassandra.db.columniterator.SimpleSliceReader.(SimpleSliceReader.java:66)
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91)
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:86)
> at 
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71)
> at 
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
> at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
> at 
> org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
> at 
> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
> at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392)
> at 
> org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684)
> at 
> org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
> at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191
> at 
> org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
> at 
> org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50)
> at 
> org.apache.cassandra.db.columniterator.SimpleSliceReader.(SimpleSliceReader.java:57)
> ... 24 more





[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081718#comment-13081718
 ] 

Sylvain Lebresne commented on CASSANDRA-1717:
-

lgtm, +1

> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717-v3.patch, 
> CASSANDRA-1717.patch, checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.





[jira] [Created] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Chris Lohfink (JIRA)
404 on apt-get install from http://www.apache.org/dist/cassandra/debian
---

 Key: CASSANDRA-3009
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation & website
Affects Versions: 0.8.3
 Environment: ubuntu maverick 64-bit
Reporter: Chris Lohfink
Priority: Minor


First bug report on here, so sorry if I am doing something incorrectly.  I 
followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am 
receiving a 404 error during the install.  Looks like the following:
{code}
clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra
[sudo] password for clohfink: 
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libcommons-pool-java authbind libmcrypt4 libtomcat6-java libcommons-dbcp-java 
tomcat6-common
Use 'apt-get autoremove' to remove them.
The following NEW packages will be installed:
  cassandra
0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded.
Need to get 8,415kB of archives.
After this operation, 9,540kB of additional disk space will be used.
Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all 
0.8.0
  404  Not Found
Failed to fetch 
http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb
  404  Not Found
E: Unable to fetch some archives, maybe run apt-get update or try with 
--fix-missing?
{code}
for debugging info:
{code}
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
N: Can't select versions from package 'cassandra' as it purely virtual
N: No packages found
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository "deb 
http://www.apache.org/dist/cassandra/debian unstable main"
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update
...
Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en   
   
Ign http://www.apache.org/dist/cassandra/debian/ unstable/main 
Translation-en_US
...
Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages
Fetched 6,989B in 1s (5,974B/s)
Reading package lists... Done
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
Package: cassandra
Version: 0.8.0
Architecture: all
Maintainer: Eric Evans 
Installed-Size: 9316
Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
libcommons-daemon-java (>= 1.0), adduser
Recommends: libjna-java
Homepage: http://cassandra.apache.org
Priority: extra
Section: misc
Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb
Size: 8415180
SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833
SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6
MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea
Description: distributed storage system for structured data
 Cassandra is a distributed (peer-to-peer) system for the management
 and storage of structured data.
{code}

Included a Fabric script; if you have Fabric installed, you can run:
{code}
fab -H localhost install_cassandra
{code}





[jira] [Assigned] (CASSANDRA-1974) PFEPS-like snitch that uses gossip instead of a property file

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-1974:
-

Assignee: (was: Brandon Williams)

I think the biggest win is when you can automatically determine rack/dc from 
the environment somehow (e.g.: ec2snitch).  Otherwise the advantage of editing 
a file, vs edit + rsync, is small.  Small enough that it's probably not worth 
the education headache.

> PFEPS-like snitch that uses gossip instead of a property file
> -
>
> Key: CASSANDRA-1974
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1974
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Brandon Williams
>Priority: Minor
>
> Now that we have an ec2 snitch that propagates its rack/dc info via gossip 
> from CASSANDRA-1654, it doesn't make a lot of sense to use PFEPS where you 
> have to rsync the property file across all the machines when you add a node.  
> Instead, we could have a snitch where you specify its rack/dc in a property 
> file, and propagate this via gossip like the ec2 snitch.  In order to not 
> break PFEPS, this should probably be a new snitch.





[jira] [Updated] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Chris Lohfink (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-3009:
-

Attachment: fabfile.py

> 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
> ---
>
> Key: CASSANDRA-3009
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation & website
>Affects Versions: 0.8.3
> Environment: ubuntu maverick 64-bit
>Reporter: Chris Lohfink
>Priority: Minor
> Attachments: fabfile.py
>
>
> First bug report on here, so sorry if I am doing something incorrectly.  I 
> followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am 
> receiving a 404 error during the install.  Looks like the following:
> {code}
> clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra
> [sudo] password for clohfink: 
> Reading package lists... Done
> Building dependency tree   
> Reading state information... Done
> The following packages were automatically installed and are no longer 
> required:
>   libcommons-pool-java authbind libmcrypt4 libtomcat6-java 
> libcommons-dbcp-java tomcat6-common
> Use 'apt-get autoremove' to remove them.
> The following NEW packages will be installed:
>   cassandra
> 0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded.
> Need to get 8,415kB of archives.
> After this operation, 9,540kB of additional disk space will be used.
> Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all 
> 0.8.0
>   404  Not Found
> Failed to fetch 
> http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb
>   404  Not Found
> E: Unable to fetch some archives, maybe run apt-get update or try with 
> --fix-missing?
> {code}
> for debugging info:
> {code}
> clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
> N: Can't select versions from package 'cassandra' as it purely virtual
> N: No packages found
> clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository "deb 
> http://www.apache.org/dist/cassandra/debian unstable main"
> clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update
> ...
> Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en 
>  
> Ign http://www.apache.org/dist/cassandra/debian/ unstable/main 
> Translation-en_US
> ...
> Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages
> Fetched 6,989B in 1s (5,974B/s)
> Reading package lists... Done
> clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
> Package: cassandra
> Version: 0.8.0
> Architecture: all
> Maintainer: Eric Evans 
> Installed-Size: 9316
> Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
> libcommons-daemon-java (>= 1.0), adduser
> Recommends: libjna-java
> Homepage: http://cassandra.apache.org
> Priority: extra
> Section: misc
> Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb
> Size: 8415180
> SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833
> SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6
> MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea
> Description: distributed storage system for structured data
>  Cassandra is a distributed (peer-to-peer) system for the management
>  and storage of structured data.
> {code}
> Included a Fabric script; if you have Fabric installed, you can run:
> {code}
> fab -H localhost install_cassandra
> {code}





[jira] [Updated] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2892:


Attachment: 2892.patch

That's a super easy one and it removes some nasty boolean flag from 
SP.sendToHintedEndpoints so let's do it.

> Don't "replicate_on_write" with RF=1
> 
>
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2892.patch
>
>
> For counters with RF=1, we still do a read to replicate, even though there is 
> nothing to replicate to.





[jira] [Commented] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081728#comment-13081728
 ] 

Jonathan Ellis commented on CASSANDRA-2892:
---

can you spell out what's going on with this part?

{code}
-if (cm.shouldReplicateOnWrite())
+hintedEndpoints.removeAll(FBUtilities.getLocalAddress());
+
+if (cm.shouldReplicateOnWrite() && !hintedEndpoints.isEmpty())
{code}

> Don't "replicate_on_write" with RF=1
> 
>
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2892.patch
>
>
> For counters with RF=1, we still do a read to replicate, even though there is 
> nothing to replicate to.





[Cassandra Wiki] Trivial Update of "DebianPackaging" by SylvainLebresne

2011-08-09 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "DebianPackaging" page has been changed by SylvainLebresne:
http://wiki.apache.org/cassandra/DebianPackaging?action=diff&rev1=22&rev2=23

  To install on Debian or Debian derivatives, use the following sources:
  
  {{{
- deb http://www.apache.org/dist/cassandra/debian unstable main
+ deb http://www.apache.org/dist/cassandra/debian 08x main
- deb-src http://www.apache.org/dist/cassandra/debian unstable main
+ deb-src http://www.apache.org/dist/cassandra/debian 08x main
  }}}
  
- ''Note: the unstable suite points to the most current branch of development 
(for historical reasons).  Production systems should use a version-specific 
suite/codename, (for example, `06x` for the 0.6.x series, `07x` for the 0.7.x 
series, etc).''
+ You will want to replace `08x` by the series you want to use: `06x` for the 
0.6.x series, 07x for the 0.7.x series, etc... It does mean that you will not 
get major version update unless you change the series, but that is ''a 
feature''.
+ 
  
  If you run ''apt-get update'' now, you will see an error similar to this:
  {{{


[jira] [Commented] (CASSANDRA-2843) better performance on long row read

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081729#comment-13081729
 ] 

Jonathan Ellis commented on CASSANDRA-2843:
---

+1

> better performance on long row read
> ---
>
> Key: CASSANDRA-2843
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Yang Yang
> Fix For: 1.0
>
> Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, 
> fix.diff, microBenchmark.patch, patch_timing, std_timing
>
>
> currently if a row contains > 1000 columns, reads become considerably slow: 
> my test of a row with 3000 columns (standard, regular), each with 8 bytes in 
> name and 40 bytes in value, takes about 16ms.
> this is all running in memory, no disk read is involved.
> through debugging we can find
> most of this time is spent on 
> [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
> ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
> ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
> int, ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
>  Iterator, int)
> [Wall Time]  
> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
>  Iterator, int)
> [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
> ColumnFamily.addColumn() is slow because it inserts into an internal 
> concurrentSkipListMap() that maps column names to values.
> this structure is slow for two reasons: it needs to do synchronization; it 
> needs to maintain a more complex structure of map.
> but if we look at the whole read path, thrift already defines the read output 
> to be List, so it does not make sense to use a luxury map data structure in 
> the interim and finally convert it to a list. On the synchronization side, 
> since the returned CF is never going to be shared/modified by other threads, 
> we know the access is always single-threaded, so no synchronization is needed.
> But these 2 features are indeed needed for ColumnFamily in other cases, 
> particularly writes. So we can provide a different ColumnFamily to 
> CFS.getTopLevelColumnFamily(): getTopLevelColumnFamily no longer always 
> creates the standard ColumnFamily, but takes a provided returnCF, whose cost 
> is much cheaper.
> the provided patch is for demonstration now, will work further once we agree 
> on the general direction. 
> CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is 
> provided. The main work is to let FastColumnFamily use an array for internal 
> storage. At first I used binary search to insert new columns in addColumn(), 
> but later I found that even this is not necessary, since all calling 
> scenarios of ColumnFamily.addColumn() share an invariant: the inserted 
> columns come in sorted order (I still have to resolve whether that is 
> descending or ascending, but ascending works). So the current logic simply 
> compares the new column against the last column in the array; if the names 
> are not equal, append, and if they are equal, reconcile.
> slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
> flavors of the method, one accepting a returnCF. but we could definitely 
> think about what is the better way to provide this returnCF.
> this patch compiles fine, no tests are provided yet. but I tested it in my 
> application, and the performance improvement is dramatic: it offers about 50% 
> reduction in read time in the 3000-column case.
> thanks
> Yang
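The append-or-reconcile idea described above can be sketched as follows. This 
is a hypothetical illustration, not the attached patch: the class and field 
names are invented, reconciliation is simplified to "keep the higher 
timestamp", and an ArrayList stands in for the raw array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of array-backed addColumn under the sorted-input
// invariant: compare the new column only against the tail -- append if the
// names differ, reconcile (here: keep the higher timestamp) if they match.
public final class FastColumnList {
    static final class Column {
        final String name; final long timestamp; final String value;
        Column(String name, long timestamp, String value) {
            this.name = name; this.timestamp = timestamp; this.value = value;
        }
    }

    private final List<Column> columns = new ArrayList<>();

    void addColumn(Column c) {
        int last = columns.size() - 1;
        if (last >= 0 && columns.get(last).name.equals(c.name)) {
            // Same name as the tail: reconcile instead of appending.
            if (c.timestamp > columns.get(last).timestamp) columns.set(last, c);
        } else {
            columns.add(c);  // sorted input guarantees c sorts after the tail
        }
    }

    int size() { return columns.size(); }
    Column get(int i) { return columns.get(i); }

    public static void main(String[] args) {
        FastColumnList cf = new FastColumnList();
        cf.addColumn(new Column("a", 1, "x"));
        cf.addColumn(new Column("b", 1, "y"));
        cf.addColumn(new Column("b", 2, "z"));  // reconciles with prior "b"
        if (cf.size() != 2) throw new AssertionError();
        if (!cf.get(1).value.equals("z")) throw new AssertionError();
    }
}
```

This avoids both the per-insert synchronization and the tree bookkeeping of a 
ConcurrentSkipListMap, which is the source of the speedup claimed above.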





[jira] [Resolved] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-3009.
-

Resolution: Not A Problem

Sorry, this is because I don't update the 'unstable' series anymore. You should 
use 08x instead (or 07x if you feel so inclined).

Having an 'unstable' series that would silently perform major version upgrades 
felt too easily harmful, so we've switched to numbered series instead. I've 
updated the wiki accordingly. Sorry for the inconvenience.

> 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
> ---
>
> Key: CASSANDRA-3009
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation & website
>Affects Versions: 0.8.3
> Environment: ubuntu maverick 64-bit
>Reporter: Chris Lohfink
>Priority: Minor
> Attachments: fabfile.py
>
>
> First bug report on here so sorry if I am doing something incorrectly.  I 
> followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am 
> receiving a 404 error during the install.  Looks like the 
> {code}
> clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra
> [sudo] password for clohfink: 
> Reading package lists... Done
> Building dependency tree   
> Reading state information... Done
> The following packages were automatically installed and are no longer 
> required:
>   libcommons-pool-java authbind libmcrypt4 libtomcat6-java 
> libcommons-dbcp-java tomcat6-common
> Use 'apt-get autoremove' to remove them.
> The following NEW packages will be installed:
>   cassandra
> 0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded.
> Need to get 8,415kB of archives.
> After this operation, 9,540kB of additional disk space will be used.
> Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all 
> 0.8.0
>   404  Not Found
> Failed to fetch 
> http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb
>   404  Not Found
> E: Unable to fetch some archives, maybe run apt-get update or try with 
> --fix-missing?
> {code}
> for debugging info:
> {code}
> clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
> N: Can't select versions from package 'cassandra' as it purely virtual
> N: No packages found
> clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository "deb 
> http://www.apache.org/dist/cassandra/debian unstable main"
> clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update
> ...
> Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en 
>  
> Ign http://www.apache.org/dist/cassandra/debian/ unstable/main 
> Translation-en_US
> ...
> Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages
> Fetched 6,989B in 1s (5,974B/s)
> Reading package lists... Done
> clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
> Package: cassandra
> Version: 0.8.0
> Architecture: all
> Maintainer: Eric Evans 
> Installed-Size: 9316
> Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
> libcommons-daemon-java (>= 1.0), adduser
> Recommends: libjna-java
> Homepage: http://cassandra.apache.org
> Priority: extra
> Section: misc
> Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb
> Size: 8415180
> SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833
> SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6
> MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea
> Description: distributed storage system for structured data
>  Cassandra is a distributed (peer-to-peer) system for the management
>  and storage of structured data.
> {code}
> Included fabric script; if you have fabric installed, you can run
> {code}
> fab -H localhost install_cassandra
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081739#comment-13081739
 ] 

Sylvain Lebresne commented on CASSANDRA-2892:
-

Sure. Applying a counter update on the first replica has three parts: we apply 
locally (we know we are on a replica at that point), we read back (in 
cm.makeReplicationMutation, not shown in that diff), and finally we use 
sendToHintedEndpoint to send what was read to the remaining replicas. To avoid 
reapplying locally in sendToHintedEndpoint, we used to set its 
'applyMutationLocally' flag to false.

Instead, this patch removes the local node from hintedEndpoints (since we have 
already applied locally) before calling sendToHintedEndpoint. We can then just 
check whether hintedEndpoints is empty as a synonym for 'is it RF=1'.
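The idea can be sketched in a few lines. This is an illustrative Python sketch, not Cassandra's actual Java code; the function name, the mutation placeholder, and the local address are all hypothetical.

```python
# Sketch of the patch's idea: drop the local node from the replication targets
# before forwarding, so an empty target set doubles as the "RF=1, nothing to
# replicate" check.
LOCAL_NODE = "172.17.19.151"  # hypothetical local address

def replicate_counter_write(read_back_mutation, hinted_endpoints):
    """hinted_endpoints: set of replica addresses that still need the update."""
    # The mutation was already applied locally, so remove ourselves.
    remaining = hinted_endpoints - {LOCAL_NODE}
    if not remaining:
        # Empty set is synonymous with RF=1: nothing to read back and send.
        return []
    # Otherwise forward what was read back to each remaining replica.
    return [(node, read_back_mutation) for node in sorted(remaining)]

# RF=1: only the local node is a replica, so nothing is sent.
print(replicate_counter_write("mutation", {"172.17.19.151"}))  # []
# RF=2: the second replica receives the read-back value.
print(replicate_counter_write("mutation", {"172.17.19.151", "172.17.19.152"}))
```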

> Don't "replicate_on_write" with RF=1
> 
>
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2892.patch
>
>
> For counters with RF=1, we still do a read to replicate, even though there is 
> nothing to replicate it to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081738#comment-13081738
 ] 

Jonathan Ellis commented on CASSANDRA-1608:
---

Where I was going is: if we are compacting {L2.1, L3.1, L3.2, ..., L3.11}, we 
can also compact {L2.9, L3.90, L3.91, ..., L3.99}, for instance. Because if the 
input keys are non-overlapping, we know that the output keys will be as well.  
Right?
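The observation can be made concrete with a small sketch. This is a hedged Python illustration under invented names; it only encodes the claim that two compaction groups may run in parallel when their input key ranges are disjoint, since non-overlapping inputs yield non-overlapping outputs.

```python
# Illustrative check for whether two candidate compactions can run concurrently.
def key_range(sstables):
    """Overall (min_key, max_key) covered by a group of (lo, hi) sstable ranges."""
    return min(lo for lo, _ in sstables), max(hi for _, hi in sstables)

def can_run_concurrently(group_a, group_b):
    a_lo, a_hi = key_range(group_a)
    b_lo, b_hi = key_range(group_b)
    # Disjoint iff one group's range ends before the other's begins.
    return a_hi < b_lo or b_hi < a_lo

# e.g. one group covering keys 0-119 vs another covering keys 800-999:
print(can_run_concurrently([(0, 119)], [(800, 999)]))  # True
# overlapping groups must be serialized:
print(can_run_concurrently([(0, 500)], [(400, 999)]))  # False
```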

> Redesigned Compaction
> -
>
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Goffinet
>Assignee: Benjamin Coverston
> Attachments: 1608-v11.txt, 1608-v2.txt
>
>
> After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
> thinking on this subject that I wanted to lay out.
> I propose we redo the concept of how compaction works in Cassandra. At the 
> moment, compaction is kicked off based on a write access pattern, not read 
> access pattern. In most cases, you want the opposite. You want to be able to 
> track how well each SSTable is performing in the system. If we were to keep 
> statistics in-memory of each SSTable, prioritize them based on most accessed, 
> and bloom filter hit/miss ratios, we could intelligently group sstables that 
> are being read most often and schedule them for compaction. We could also 
> schedule lower-priority maintenance on SSTables that are not often accessed.
> I also propose we limit each SSTable to a fixed size; that gives us the 
> ability to better utilize our bloom filters in a predictable manner. 
> At the moment after a certain size, the bloom filters become less reliable. 
> This would also allow us to group data most accessed. Currently the size of 
> an SSTable can grow to a point where large portions of the data might not 
> actually be accessed as often.
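The proposal above (track per-SSTable read statistics and compact the hottest tables first) can be sketched as follows. This is a minimal illustration: the scoring formula and weights are invented, since the ticket only proposes tracking read counts and bloom-filter hit/miss ratios.

```python
# Hotness-prioritized compaction candidate selection (illustrative only).
from dataclasses import dataclass

@dataclass
class SSTableStats:
    name: str
    reads: int                    # how often this sstable is read
    bloom_false_positives: int    # wasted bloom-filter checks

    def score(self) -> float:
        # More reads and more wasted checks -> higher compaction priority.
        # The weight of 10 is an arbitrary, illustrative choice.
        return self.reads + 10 * self.bloom_false_positives

def pick_compaction_candidates(tables, n=4):
    """Return the n hottest SSTables as the next compaction group."""
    return sorted(tables, key=lambda t: t.score(), reverse=True)[:n]

tables = [
    SSTableStats("a", reads=900, bloom_false_positives=40),
    SSTableStats("b", reads=10, bloom_false_positives=1),
    SSTableStats("c", reads=500, bloom_false_positives=90),
]
print([t.name for t in pick_compaction_candidates(tables, n=2)])  # ['c', 'a']
```

Rarely-read tables (like "b" above) sink to the bottom of the list, which matches the idea of scheduling only lower-priority maintenance for them.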

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction

2011-08-09 Thread Benjamin Coverston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081746#comment-13081746
 ] 

Benjamin Coverston commented on CASSANDRA-1608:
---

We know the input and output keys, yes.

If we isolate the problem to concurrent compactions in the same level, and 
staggered levels {L2, L3}, {L4, L5}, it is certainly an easier problem.

> Redesigned Compaction
> -
>
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Goffinet
>Assignee: Benjamin Coverston
> Attachments: 1608-v11.txt, 1608-v2.txt
>
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-1608) Redesigned Compaction

2011-08-09 Thread Benjamin Coverston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081746#comment-13081746
 ] 

Benjamin Coverston edited comment on CASSANDRA-1608 at 8/9/11 4:36 PM:
---

We know the input and output keys, yes.

If we isolate the problem to concurrent compactions in the same level, and 
staggered levels {L2, L3}, {L4, L5} it is certainly an easier problem.

  was (Author: bcoverston):
We know the input and output keys, yes.

If we isolate the problem to concurrent compactions in the same level, and 
staggered levels {L2, L3}, {L4, L5}.
  
> Redesigned Compaction
> -
>
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Goffinet
>Assignee: Benjamin Coverston
> Attachments: 1608-v11.txt, 1608-v2.txt
>
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081748#comment-13081748
 ] 

Chris Lohfink commented on CASSANDRA-3009:
--

Thanks!  Worked great.

> 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
> ---
>
> Key: CASSANDRA-3009
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation & website
>Affects Versions: 0.8.3
> Environment: ubuntu maverick 64-bit
>Reporter: Chris Lohfink
>Priority: Minor
> Attachments: fabfile.py
>
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Chris Lohfink (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-3009:
-

Comment: was deleted

(was: wiki was updated with distribution changes)

> 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
> ---
>
> Key: CASSANDRA-3009
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation & website
>Affects Versions: 0.8.3
> Environment: ubuntu maverick 64-bit
>Reporter: Chris Lohfink
>Priority: Minor
> Attachments: fabfile.py
>
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Closed] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Chris Lohfink (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink closed CASSANDRA-3009.



wiki was updated with distribution changes

> 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
> ---
>
> Key: CASSANDRA-3009
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation & website
>Affects Versions: 0.8.3
> Environment: ubuntu maverick 64-bit
>Reporter: Chris Lohfink
>Priority: Minor
> Attachments: fabfile.py
>
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (CASSANDRA-2919) CQL system test for counters is failing

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-2919.
-

Resolution: Cannot Reproduce

Ok, I cannot reproduce either anymore. Probably got fixed, or I screwed up the 
first time. Sorry for that.

> CQL system test for counters is failing
> ---
>
> Key: CASSANDRA-2919
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2919
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
> Environment: ubuntu 11.04 64 bit
>Reporter: Sylvain Lebresne
>Assignee: Tyler Hobbs
>Priority: Minor
>  Labels: cql, test
>
> On my machine (and on current 0.8 branch) the CQL system test for counters is 
> failing. While reading the counter value, junk bytes are apparently returned 
> instead of the value (on the following excerpt it looks like a empty value, 
> but on the terminal it does show a random character):
> {noformat}
> ==
> FAIL: update statement should be able to work with counter columns
> --
> Traceback (most recent call last):
>   File "/usr/lib/pymodules/python2.7/nose/case.py", line 186, in runTest
> self.test(*self.arg)
>   File "/home/pcmanus/Git/cassandra/test/system/test_cql.py", line 1130, in 
> test_counter_column_support
> "unrecognized value '%s'" % r[1]
> AssertionError: unrecognized value ''
> --
> {noformat}
> I've checked: the server correctly fetches the right column and returns what 
> it should. So this seems to be on the Python driver side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081773#comment-13081773
 ] 

Hudson commented on CASSANDRA-1717:
---

Integrated in Cassandra #1010 (See 
[https://builds.apache.org/job/Cassandra/1010/])
Add block level checksum for compressed data
patch by Pavel Yaskevich; reviewed by Sylvain Lebresne for CASSANDRA-1717

xedin : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155420
Files : 
* /cassandra/trunk/test/unit/org/apache/cassandra/Util.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/io/compress/CompressedRandomAccessReaderTest.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/io/compress/CorruptedBlockException.java
* /cassandra/trunk/CHANGES.txt
* 
/cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/io/util/BufferedRandomAccessFileTest.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressionMetadata.java
* /cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressedSequentialWriter.java
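The commit above adds a per-block checksum to compressed data. A minimal Python sketch of the technique (the real implementation is Java, in CompressedSequentialWriter and CompressedRandomAccessReader; this illustration only shows the append-CRC-on-write, verify-on-read pattern):

```python
# Block-level checksumming sketch: a CRC32 is appended to each compressed
# block on write and verified on read, so bitrot is detected even when the
# corrupted bytes would otherwise still decode "successfully".
import struct
import zlib

def write_block(data: bytes) -> bytes:
    compressed = zlib.compress(data)
    checksum = zlib.crc32(compressed) & 0xFFFFFFFF
    return compressed + struct.pack(">I", checksum)

def read_block(block: bytes) -> bytes:
    compressed, stored = block[:-4], struct.unpack(">I", block[-4:])[0]
    if (zlib.crc32(compressed) & 0xFFFFFFFF) != stored:
        # Analogous in spirit to the CorruptedBlockException added by the patch.
        raise IOError("corrupted block")
    return zlib.decompress(compressed)

block = write_block(b"column data")
assert read_block(block) == b"column data"

# Flip one byte: the checksum catches the corruption before decompression.
corrupt = bytes([block[0] ^ 0xFF]) + block[1:]
try:
    read_block(corrupt)
except IOError as e:
    print("detected:", e)
```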


> Cassandra cannot detect corrupt-but-readable column data
> 
>
> Key: CASSANDRA-1717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
> Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717-v3.patch, 
> CASSANDRA-1717.patch, checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value can even resist being overwritten by 
> newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2843) better performance on long row read

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081772#comment-13081772
 ] 

Hudson commented on CASSANDRA-2843:
---

Integrated in Cassandra #1010 (See 
[https://builds.apache.org/job/Cassandra/1010/])
Make ColumnFamily backing column map pluggable and introduce unsynchronized 
ArrayList backed map for reads
patch by slebresne; reviewed by jbellis for CASSANDRA-2843

slebresne : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155426
Files : 
* /cassandra/trunk/src/java/org/apache/cassandra/db/filter/IFilter.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/filter/QueryFilter.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/SuperColumn.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/filter/NamesQueryFilter.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/Table.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/RowRepairResolver.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/IColumnContainer.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/db/ArrayBackedSortedColumnsTest.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/ArrayBackedSortedColumns.java
* /cassandra/trunk/test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ThreadSafeSortedColumns.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
* /cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/filter/SliceQueryFilter.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamily.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ISortedColumns.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ReadResponse.java
* /cassandra/trunk/CHANGES.txt
* /cassandra/trunk/test/unit/org/apache/cassandra/db/RowTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/Row.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/streaming/StreamingTransferTest.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/service/AntiEntropyServiceTestAbstract.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilySerializer.java


> better performance on long row read
> ---
>
> Key: CASSANDRA-2843
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Yang Yang
>Assignee: Sylvain Lebresne
> Fix For: 1.0
>
> Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, 
> fix.diff, microBenchmark.patch, patch_timing, std_timing
>
>
> currently if a row contains > 1000 columns, the run time becomes considerably 
> slow: my test of a row with 3000 columns (standard, regular), each with 8 
> bytes in name and 40 bytes in value, takes about 16ms.
> this is all running in memory, no disk read is involved.
> through debugging we can find
> most of this time is spent on 
> [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
> ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
> ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
> int, ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
>  Iterator, int)
> [Wall Time]  
> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
>  Iterator, int)
> [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
> ColumnFamily.addColumn() is slow because it inserts into an internal 
> ConcurrentSkipListMap that maps column names to values.
> this structure is slow for two reasons: it needs to do synchronization, and 
> it needs to maintain a more complex map structure.
> but if we look at the whole read path, thrift already defines the read output 
> to be a List, so it does not make sense to use a luxury map 
> data structure in the interim and finally convert it to a list. on the 
> synchronization side, since the returned CF is never going to be 
> shared/modified by other threads, we know the access is always 
> single-threaded, so no synchronization is needed.
> but these 2 features are indeed needed for ColumnFamily in other cases, parti

[jira] [Updated] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2990:


Attachment: 2990.patch

> We should refuse query for counters at CL.ANY
> -
>
> Key: CASSANDRA-2990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2990.patch
>
>
> We currently do not reject writes for counters at CL.ANY, even though this is 
> not supported (and rightly so).
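The validation the patch describes is simple to sketch. This is an illustrative Python version; the enum values and exception name are invented, not Cassandra's exact API.

```python
# Reject counter writes at ConsistencyLevel.ANY up front (illustrative sketch).
from enum import Enum

class ConsistencyLevel(Enum):
    ANY = 0
    ONE = 1
    QUORUM = 2

class InvalidRequestError(Exception):
    pass

def validate_counter_write(consistency: ConsistencyLevel) -> None:
    if consistency is ConsistencyLevel.ANY:
        # ANY permits hint-only writes, which counters cannot support: a
        # counter update must land on a live replica so it can be read back
        # and replicated.
        raise InvalidRequestError(
            "consistency level ANY is not supported for counter operations")

validate_counter_write(ConsistencyLevel.ONE)  # accepted
try:
    validate_counter_write(ConsistencyLevel.ANY)
except InvalidRequestError as e:
    print(e)
```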

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2892:
--

Attachment: 2892-v1.5.txt

v1.5 attached.  I thought I could improve it more, but couldn't. :)

Ended up just extracting counterWriteTask() to remove the 
executeOnMutationStage flag.

> Don't "replicate_on_write" with RF=1
> 
>
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2892-v1.5.txt, 2892.patch
>
>
> For counters with RF=1, we still do a read to replicate, even though there is 
> nothing to replicate it to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081802#comment-13081802
 ] 

Sylvain Lebresne commented on CASSANDRA-2892:
-

v1.5 lgtm

> Don't "replicate_on_write" with RF=1
> 
>
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2892-v1.5.txt, 2892.patch
>
>
> For counters with RF=1, we still do a read to replicate, even though there is 
> nothing to replicate it to.





svn commit: r1155460 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/service/StorageProxy.java

2011-08-09 Thread jbellis
Author: jbellis
Date: Tue Aug  9 18:37:20 2011
New Revision: 1155460

URL: http://svn.apache.org/viewvc?rev=1155460&view=rev
Log:
avoid doing read for no-op replicate-on-write at CL=1
patch by slebresne and jbellis for CASSANDRA-2892

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155460&r1=1155459&r2=1155460&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug  9 18:37:20 2011
@@ -1,6 +1,7 @@
 0.8.4
  * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972)
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
+ * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
 
 
 0.8.3

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java?rev=1155460&r1=1155459&r2=1155460&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java
 Tue Aug  9 18:37:20 2011
@@ -96,7 +96,7 @@ public class StorageProxy implements Sto
 public void apply(IMutation mutation, Multimap hintedEndpoints, IWriteResponseHandler responseHandler, String 
localDataCenter, ConsistencyLevel consistency_level) throws IOException
 {
 assert mutation instanceof RowMutation;
-sendToHintedEndpoints((RowMutation) mutation, hintedEndpoints, 
responseHandler, localDataCenter, true, consistency_level);
+sendToHintedEndpoints((RowMutation) mutation, hintedEndpoints, 
responseHandler, localDataCenter, consistency_level);
 }
 };
 
@@ -110,7 +110,11 @@ public class StorageProxy implements Sto
 {
 public void apply(IMutation mutation, Multimap hintedEndpoints, IWriteResponseHandler responseHandler, String 
localDataCenter, ConsistencyLevel consistency_level) throws IOException
 {
-applyCounterMutation(mutation, hintedEndpoints, 
responseHandler, localDataCenter, consistency_level, false);
+if (logger.isDebugEnabled())
+logger.debug("insert writing local & replicate " + 
mutation.toString(true));
+
+Runnable runnable = counterWriteTask(mutation, 
hintedEndpoints, responseHandler, localDataCenter, consistency_level);
+runnable.run();
 }
 };
 
@@ -118,7 +122,11 @@ public class StorageProxy implements Sto
 {
 public void apply(IMutation mutation, Multimap hintedEndpoints, IWriteResponseHandler responseHandler, String 
localDataCenter, ConsistencyLevel consistency_level) throws IOException
 {
-applyCounterMutation(mutation, hintedEndpoints, 
responseHandler, localDataCenter, consistency_level, true);
+if (logger.isDebugEnabled())
+logger.debug("insert writing local & replicate " + 
mutation.toString(true));
+
+Runnable runnable = counterWriteTask(mutation, 
hintedEndpoints, responseHandler, localDataCenter, consistency_level);
+StageManager.getStage(Stage.MUTATION).execute(runnable);
 }
 };
 }
@@ -218,7 +226,7 @@ public class StorageProxy implements Sto
 return 
ss.getTokenMetadata().getWriteEndpoints(StorageService.getPartitioner().getToken(key),
 table, naturalEndpoints);
 }
 
-private static void sendToHintedEndpoints(final RowMutation rm, 
Multimap hintedEndpoints, IWriteResponseHandler 
responseHandler, String localDataCenter, boolean insertLocalMessages, 
ConsistencyLevel consistency_level)
+private static void sendToHintedEndpoints(final RowMutation rm, 
Multimap hintedEndpoints, IWriteResponseHandler 
responseHandler, String localDataCenter, ConsistencyLevel consistency_level)
 throws IOException
 {
 // Multimap that holds onto all the messages and addresses meant for a 
specific datacenter
@@ -237,8 +245,7 @@ public class StorageProxy implements Sto
 // unhinted writes
 if (destination.equals(FBUtilities.getLocalAddress()))
 {
-if (insertLocalMessages)
-insertLocal(rm, responseHandler);
+insertLocal(rm, responseHandler);
 }
 else
 {
@@ -425,13 +432,9 @@ public class Storag

[jira] [Resolved] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2892.
---

Resolution: Fixed
  Reviewer: jbellis

committed

> Don't "replicate_on_write" with RF=1
> 
>
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2892-v1.5.txt, 2892.patch
>
>
> For counters with RF=1, we still do a read to replicate, even though there is 
> nothing to replicate it to.





svn commit: r1155466 - in /cassandra/trunk: ./ contrib/ debian/ interface/thrift/gen-java/org/apache/cassandra/thrift/ redhat/ src/java/org/apache/cassandra/cli/ src/java/org/apache/cassandra/service/

2011-08-09 Thread jbellis
Author: jbellis
Date: Tue Aug  9 18:40:54 2011
New Revision: 1155466

URL: http://svn.apache.org/viewvc?rev=1155466&view=rev
Log:
merge from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)
cassandra/trunk/debian/control

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/redhat/cassandra
cassandra/trunk/src/java/org/apache/cassandra/cli/Cli.g
cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java
cassandra/trunk/src/java/org/apache/cassandra/cli/CliCompleter.java
cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java
cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
cassandra/trunk/src/resources/org/apache/cassandra/cli/CliHelp.yaml
cassandra/trunk/test/unit/org/apache/cassandra/cli/CliTest.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 18:40:54 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
 /cassandra/branches/cassandra-0.7:1026516-1151306
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1154424
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1155466&r1=1155465&r2=1155466&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Tue Aug  9 18:40:54 2011
@@ -33,6 +33,8 @@
 
 0.8.4
  * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972)
+ * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
+ * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
 
 
 0.8.3

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 18:40:54 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
 /cassandra/branches/cassandra-0.7/contrib:1026516-1151306
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1154424
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Modified: cassandra/trunk/debian/control
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/debian/control?rev=1155466&r1=1155465&r2=1155466&view=diff
==
--- cassandra/trunk/debian/control (original)
+++ cassandra/trunk/debian/control Tue Aug  9 18:40:54 2011
@@ -2,7 +2,7 @@ Source: cassandra
 Section: misc
 Priority: extra
 Maintainer: Eric Evans 
-Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 
1.7), ant-optional (>= 1.7)
+Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 
1.7), ant-optional (>= 1.7), subversion
 Homepage: http://cassandra.apache.org
 Vcs-Svn: https://svn.apache.org/repos/asf/cassandra/trunk
 Vcs-Browser: http://svn.apache.org/viewvc/cassandra/trunk

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 18:40:54 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1151306
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thr

[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081831#comment-13081831
 ] 

Jonathan Ellis commented on CASSANDRA-2990:
---

A few days ago, you said, "A counter mutation only lives long enough to be 
applied to the first replica. Once this is done, a *row* mutation is generated 
for the other replicas. That second mutation can be hinted. But that is a row 
mutation, so there should be no special casing at all for that."

Why can't we hint the first replica?

> We should refuse query for counters at CL.ANY
> -
>
> Key: CASSANDRA-2990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2990.patch
>
>
> We currently do not reject writes for counters at CL.ANY, even though this is 
> not supported (and rightly so).





buildbot failure in ASF Buildbot on cassandra-trunk

2011-08-09 Thread buildbot
The Buildbot has detected a new failure on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1503

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1155466
Blamelist: jbellis

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



[jira] [Commented] (CASSANDRA-2868) Native Memory Leak

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081834#comment-13081834
 ] 

Brandon Williams commented on CASSANDRA-2868:
-

bq. Wouldn't it be worth indicating how many collections have been done 
since the last log message if it's > 1, since it can (be > 1).

The only reason I added count tracking was to prevent it from firing when there 
were no GCs (the api is flakey.)  I've never actually been able to get > 1 to 
happen, but we can add it to the logging.

bq. IMO the duration-based thresholds are hard to reason about here, where 
we're dealing w/ summaries and not individual GC results.

We are dealing with individual GCs at least 99% of the time in practice.  The 
worst case is >1 GC inflates the gctime enough that we errantly log when it's 
not needed, but I imagine to trigger that you would have to be in a gc pressure 
situation already.

bq. I think I'd rather have something like the dropped messages logger, where 
every N seconds we log the summary we get from the mbean.

That seems like it could be a lot of noise since GC is constantly happening.

bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be 
removed. 

I think the logic there is still sound ("Did we just do a CMS? Is the heap 
still 80% full?") and it seems to work as well as it always has.
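The count-tracking approach described above (only report when a collection actually ran since the last poll, and the delta can exceed 1) can be sketched against the standard `GarbageCollectorMXBean` API. The class and method names below are illustrative, not Cassandra's actual GCInspector:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

public class GCInspectorSketch {
    final Map<String, Long> lastCounts = new HashMap<>();
    final Map<String, Long> lastTimes = new HashMap<>();

    // Poll all GC beans; report only those whose collection count advanced
    // since the last poll, so nothing is logged when no GC actually ran.
    void poll() {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            long time = gc.getCollectionTime();
            long prevCount = lastCounts.getOrDefault(gc.getName(), 0L);
            long prevTime = lastTimes.getOrDefault(gc.getName(), 0L);
            if (count > prevCount) {
                long collections = count - prevCount;  // can be > 1 between polls
                long elapsedMs = time - prevTime;      // summed pause time, not per-GC
                System.out.printf("%s: %d collection(s), %d ms total%n",
                                  gc.getName(), collections, elapsedMs);
            }
            lastCounts.put(gc.getName(), count);
            lastTimes.put(gc.getName(), time);
        }
    }

    public static void main(String[] args) {
        GCInspectorSketch inspector = new GCInspectorSketch();
        inspector.poll();  // baseline poll; nothing reported on the first pass
        System.gc();       // request a collection...
        inspector.poll();  // ...so this poll may have a delta to report
    }
}
```

Note the worst case mentioned above: when more than one GC falls between polls, the elapsed time is the sum over all of them, which can inflate the per-collection figure.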



> Native Memory Leak
> --
>
> Key: CASSANDRA-2868
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Daniel Doubleday
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 0.8.4
>
> Attachments: 2868-v1.txt, 2868-v2.txt, 48hour_RES.png, 
> low-load-36-hours-initial-results.png
>
>
> We have memory issues with long-running servers. These have been confirmed by 
> several users on the user list, which is why I'm reporting it.
> The memory consumption of the cassandra java process increases steadily until 
> it's killed by the os because of oom (with no swap)
> Our server is started with -Xmx3000M and running for around 23 days.
> pmap -x shows
> Total SST: 1961616 (mem mapped data and index files)
> Anon  RSS: 6499640
> Total RSS: 8478376
> This shows that > 3G are 'overallocated'.
> We will use BRAF on one of our less important nodes to check whether it is 
> related to mmap and report back.
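The "> 3G overallocated" figure follows from the numbers in the report; a quick check of the arithmetic (units are KB, as printed by `pmap -x`):

```java
public class OverallocationCheck {
    public static void main(String[] args) {
        long anonRssKb = 6_499_640;      // Anon RSS reported by pmap -x (KB)
        long heapMaxKb = 3000L * 1024;   // -Xmx3000M expressed in KB
        long beyondHeapKb = anonRssKb - heapMaxKb;
        // Anonymous memory not covered by the configured Java heap
        System.out.println(beyondHeapKb + " KB, roughly "
                           + beyondHeapKb / (1024 * 1024) + " GB beyond -Xmx");
    }
}
```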





[jira] [Issue Comment Edited] (CASSANDRA-2868) Native Memory Leak

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081834#comment-13081834
 ] 

Brandon Williams edited comment on CASSANDRA-2868 at 8/9/11 6:43 PM:
-

bq. Wouldn't it be worth indicating how many collections have been done 
since the last log message if it's > 1, since it can (be > 1).

The only reason I added count tracking was to prevent it from firing when there 
were no GCs (the api is flakey.)  I've never actually been able to get > 1 to 
happen, but we can add it to the logging.

bq. IMO the duration-based thresholds are hard to reason about here, where 
we're dealing w/ summaries and not individual GC results.

We are dealing with individual GCs at least 99% of the time in practice.  The 
worst case is >1 GC inflates the gctime enough that we errantly log when it's 
not needed, but I imagine to trigger that you would have to be in a gc pressure 
situation already.

bq. I think I'd rather have something like the dropped messages logger, where 
every N seconds we log the summary we get from the mbean.

That seems like it could be a lot of noise since GC is constantly happening.

bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be 
removed. 

I think the logic there is still sound ("Did we just do a CMS? Is the heap 
still 80% full?") and it seems to work as well as it always has.



  was (Author: brandon.williams):
bq. Wouldn't it be worth indicating that how many collection have been done 
since last log message if it's > 1, since it can (be > 1).

The only reason I added count tracking was to prevent it from firing when there 
were no GCs (the api is flakey.)  I've never actually been able to get > 1 to 
happen, but we can add it to the logging.

bq. IMO the duration-based thresholds are hard to reason about here, where 
we're dealing w/ summaries and not individual GC results.

We are dealing with individual GCs at least 99% of the time in practice.  The 
worst case is >1 GC inflates the gctime enough that we errantly log when it's 
not needed, but I imagine to trigger that you would have to be in a gc pressure 
situation already.

bq. I think I'd rather have something like the dropped messages logger, where 
every N seconds we log the summary we get from the mbean.

That seems like it could a lot of noise since GC is constantly happening.

bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be 
removed. 

I think the logic there is still sound ("Did we just do a CMS? Is the heap 
still 80% full?") and it seems to work as well as it always has.


  
> Native Memory Leak
> --
>
> Key: CASSANDRA-2868
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Daniel Doubleday
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 0.8.4
>
> Attachments: 2868-v1.txt, 2868-v2.txt, 48hour_RES.png, 
> low-load-36-hours-initial-results.png
>
>
> We have memory issues with long-running servers. These have been confirmed by 
> several users on the user list, which is why I'm reporting it.
> The memory consumption of the cassandra java process increases steadily until 
> it's killed by the os because of oom (with no swap)
> Our server is started with -Xmx3000M and running for around 23 days.
> pmap -x shows
> Total SST: 1961616 (mem mapped data and index files)
> Anon  RSS: 6499640
> Total RSS: 8478376
> This shows that > 3G are 'overallocated'.
> We will use BRAF on one of our less important nodes to check whether it is 
> related to mmap and report back.





[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081854#comment-13081854
 ] 

Sylvain Lebresne commented on CASSANDRA-2990:
-

bq. Why can't we hint the first replica?

Well, actually I think we could; or at least, if we cannot, I forgot why. We 
would need to be sure we never replay a hint twice though, which I'm not sure 
is guaranteed right now. Also, we can only do this if what we store as a 
hint is the serialized mutation (in this case, the serialized CounterMutation): 
we can't apply the CounterMutation on a non-replica (partly because that would 
potentially inflate the counter context too much, partly because counter 
removes are problematic, which would probably be an issue at some point).

So it should be doable, but it's a bit of work.
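The two requirements in that comment, store the serialized mutation as an opaque payload and never replay a hint twice, can be sketched as follows. This is a hypothetical illustration; the class, `storeHint`, and `deliverOnce` are invented names, not Cassandra's hint API:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

public class CounterHintSketch {
    private final Map<UUID, byte[]> pendingHints = new HashMap<>();
    private final Set<UUID> delivered = new HashSet<>();

    // Store the serialized mutation as an opaque payload; the non-replica
    // node never deserializes or applies the CounterMutation itself.
    public UUID storeHint(byte[] serializedMutation) {
        UUID id = UUID.randomUUID();
        pendingHints.put(id, serializedMutation);
        return id;
    }

    // Hand the payload out exactly once; a second delivery attempt returns
    // null, modelling the "never replay a hint twice" requirement.
    public byte[] deliverOnce(UUID id) {
        if (!delivered.add(id))
            return null;
        return pendingHints.remove(id);
    }
}
```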

> We should refuse query for counters at CL.ANY
> -
>
> Key: CASSANDRA-2990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2990.patch
>
>
> We currently do not reject writes for counters at CL.ANY, even though this is 
> not supported (and rightly so).





[jira] [Commented] (CASSANDRA-2892) Don't "replicate_on_write" with RF=1

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081858#comment-13081858
 ] 

Hudson commented on CASSANDRA-2892:
---

Integrated in Cassandra-0.8 #264 (See 
[https://builds.apache.org/job/Cassandra-0.8/264/])
avoid doing read for no-op replicate-on-write at CL=1
patch by slebresne and jbellis for CASSANDRA-2892

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155460
Files : 
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java


> Don't "replicate_on_write" with RF=1
> 
>
> Key: CASSANDRA-2892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2892-v1.5.txt, 2892.patch
>
>
> For counters with RF=1, we still do a read to replicate, even though there is 
> nothing to replicate it to.





[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081862#comment-13081862
 ] 

Jonathan Ellis commented on CASSANDRA-2990:
---

Okay, +1 on making the validation match what is actually currently supported 
(no ANY for counters), although I'd change "not supported" to "not yet 
supported."

We can deal w/ adding ANY support if and when someone actually needs it.
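The agreed validation is simple to state; a hedged sketch of the shape of the check (the enum and the message wording are illustrative, not the actual patch):

```java
public class CounterWriteValidation {
    enum ConsistencyLevel { ANY, ONE, QUORUM, ALL }

    // Reject counter writes at CL.ANY up front: ANY implies the write may
    // only be hinted, and hinting the first counter replica is not (yet)
    // supported.
    static void validateCounterWrite(ConsistencyLevel cl) {
        if (cl == ConsistencyLevel.ANY)
            throw new IllegalArgumentException(
                "consistency level ANY is not yet supported for counter writes");
    }

    public static void main(String[] args) {
        validateCounterWrite(ConsistencyLevel.QUORUM);  // accepted
        try {
            validateCounterWrite(ConsistencyLevel.ANY);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```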

> We should refuse query for counters at CL.ANY
> -
>
> Key: CASSANDRA-2990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2990.patch
>
>
> We currently do not reject writes for counters at CL.ANY, even though this is 
> not supported (and rightly so).





[jira] [Resolved] (CASSANDRA-2518) invalid column name length 0

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2518.
---

Resolution: Duplicate

probably CASSANDRA-2675, fixed in 0.7.7

> invalid column name length 0
> 
>
> Key: CASSANDRA-2518
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2518
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.7.3
> Environment: three nodes, 
> JVM:
> -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn2400M 
> -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC 
> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 
> -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
>Reporter: lichenglin
>
> one of the three nodes cassandra 0.7.3 report error after start up:
> ERROR [CompactionExecutor:1] 2011-04-16 22:18:39,281 PrecompactedRow.java 
> (line 82) Skipping row DecoratedKey(3813860378406449638560060231106122758, 
> 79616e79776275636b65743030303030303030312f6f626a303030303030323534) in 
> /opt/cassandra/data/Keyspace/cf-f-4715-Data.db
> org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid 
> column name length 0
> at 
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:68)
> at 
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
> at 
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176)
> at 
> org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
> at 
> org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:139)
> at 
> org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108)
> at 
> org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43)
> at 
> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
> at 
> org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
> at 
> org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
> at 
> org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:449)
> at 
> org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124)
> at 
> org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> and few minutes later,
> ERROR [CompactionExecutor:1] 2011-04-16 22:20:20,073 
> AbstractCassandraDaemon.java (line 114) Fatal exception in thread 
> Thread[CompactionExecutor:1,1,main]
> java.lang.OutOfMemoryError: Java heap space
> at 
> org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:267)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:310)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:267)
> at 
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
> at 
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
> at 
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176)
> at 
> org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
> at 
> org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:139)
> at 
> org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108)
> at 
> org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43)
> at 
> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)

[Cassandra Wiki] Trivial Update of "Committers" by JonathanEllis

2011-08-09 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "Committers" page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/Committers?action=diff&rev1=15&rev2=16

Comment:
update release manager

  ||Avinash Lakshman||Jan 2009||Facebook||Co-author of Facebook Cassandra||
  ||Prashant Malik||Jan 2009||Facebook||Co-author of Facebook Cassandra||
  ||Jonathan Ellis||Mar 2009||Datastax||Project chair||
- ||Eric Evans||Jun 2009||Rackspace||PMC member, Release manager, Debian 
packager||
+ ||Eric Evans||Jun 2009||Rackspace||PMC member, Debian packager||
  ||Jun Rao||Jun 2009||!LinkedIn||PMC member||
  ||Chris Goffinet||Sept 2009||Twitter||PMC member||
  ||Johan Oskarsson||Nov 2009||Twitter||Also a 
[[http://hadoop.apache.org/|Hadoop]] committer||
@@ -12, +12 @@

  ||Jaakko Laine||Dec 2009||?|| ||
  ||Brandon Williams||Jun 2010||Datastax||PMC member||
  ||Jake Luciani||Jan 2011||Datastax||Also a 
[[http://thrift.apache.org/|Thrift]] committer||
- ||Sylvain Lebresne||Mar 2011||Datastax||PMC member||
+ ||Sylvain Lebresne||Mar 2011||Datastax||PMC member, Release manager||
  ||Pavel Yaskevich||Aug 2011||Datastax|| ||
  


[jira] [Commented] (CASSANDRA-2993) Issues with parameters being escaped correctly in Python CQL

2011-08-09 Thread Blake Visin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081879#comment-13081879
 ] 

Blake Visin commented on CASSANDRA-2993:


Works for me too.  Thanks Tyler!

> Issues with parameters being escaped correctly in Python CQL
> 
>
> Key: CASSANDRA-2993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2993
> Project: Cassandra
>  Issue Type: Bug
> Environment: Python CQL
>Reporter: Blake Visin
>Assignee: Tyler Hobbs
>  Labels: CQL, parameter, python
> Attachments: 2993-cql-grammar.txt, 2993-pycql.txt, 
> 2993-system-test.txt
>
>
> When using parameterised queries in Python CQL, strings are not being 
> escaped correctly.
> Query and Parameters:
> {code}
> 'UPDATE sites SET :col = :val WHERE KEY = :site_id'
> {'col': 'feed_stats:1312493736688033024',
>  'site_id': '899d15e8-bd4a-11e0-bc8c-001fe14cba06',
>  'val': 
> "(dp0\nS'1'\np1\n(lp2\nI1\naI2\naI3\naI4\nasS'0'\np3\n(lp4\nI1\naI2\naI3\naI4\nasS'3'\np5\n(lp6\nI1\naI2\naI3\naI4\nasS'2'\np7\n(lp8\nI1\naI2\naI3\naI4\nas."}
> {code}
> Query that ends up being executed after parameter substitution:
> {code} 
> "UPDATE sites SET 'feed_stats:1312493736688033024' = 
> '(dp0\nS''1''\np1\n(lp2\nI1\naI2\naI3\naI4\nasS''0''\np3\n(lp4\nI1\naI2\naI3\naI4\nasS''3''\np5\n(lp6\nI1\naI2\naI3\naI4\nasS''2''\np7\n(lp8\nI1\naI2\naI3\naI4\nas.'
>  WHERE KEY = '899d15e8-bd4a-11e0-bc8c-001fe14cba06'"
> {code}





[jira] [Updated] (CASSANDRA-2993) Issues with parameters being escaped correctly in Python CQL

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2993:
--

Reviewer: xedin

> Issues with parameters being escaped correctly in Python CQL
> 
>
> Key: CASSANDRA-2993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2993
> Project: Cassandra
>  Issue Type: Bug
> Environment: Python CQL
>Reporter: Blake Visin
>Assignee: Tyler Hobbs
>  Labels: CQL, parameter, python
> Attachments: 2993-cql-grammar.txt, 2993-pycql.txt, 
> 2993-system-test.txt
>
>
> When using parameterised queries in Python CQL, strings are not being 
> escaped correctly.
> Query and Parameters:
> {code}
> 'UPDATE sites SET :col = :val WHERE KEY = :site_id'
> {'col': 'feed_stats:1312493736688033024',
>  'site_id': '899d15e8-bd4a-11e0-bc8c-001fe14cba06',
>  'val': 
> "(dp0\nS'1'\np1\n(lp2\nI1\naI2\naI3\naI4\nasS'0'\np3\n(lp4\nI1\naI2\naI3\naI4\nasS'3'\np5\n(lp6\nI1\naI2\naI3\naI4\nasS'2'\np7\n(lp8\nI1\naI2\naI3\naI4\nas."}
> {code}
> Query that ends up being executed after parameter substitution:
> {code} 
> "UPDATE sites SET 'feed_stats:1312493736688033024' = 
> '(dp0\nS''1''\np1\n(lp2\nI1\naI2\naI3\naI4\nasS''0''\np3\n(lp4\nI1\naI2\naI3\naI4\nasS''3''\np5\n(lp6\nI1\naI2\naI3\naI4\nasS''2''\np7\n(lp8\nI1\naI2\naI3\naI4\nas.'
>  WHERE KEY = '899d15e8-bd4a-11e0-bc8c-001fe14cba06'"
> {code}





[jira] [Updated] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk

2011-08-09 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated CASSANDRA-2325:
---

Attachment: cassandra-2325.patch.2.txt

> invalidateKeyCache / invalidateRowCache should remove saved cache files from 
> disk
> -
>
> Key: CASSANDRA-2325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2325
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 0.7.8, 0.8.2
>Reporter: Matthew F. Dennis
>Assignee: Edward Capriolo
>Priority: Minor
> Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt
>
>
> the invalidate[Key|Row]Cache calls don't remove the saved caches from disk.
> It seems logical that if you are clearing the caches you don't expect them to 
> be reinstantiated with the old values the next time C* starts.
> This is not a huge issue since next time the caches are saved the old values 
> will be removed.





svn commit: r1155544 - /cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java

2011-08-09 Thread jbellis
Author: jbellis
Date: Tue Aug  9 20:18:47 2011
New Revision: 1155544

URL: http://svn.apache.org/viewvc?rev=1155544&view=rev
Log:
r/m merged reference to obsolete memtable_flush_after_mins

Modified:
cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java?rev=1155544&r1=1155543&r2=1155544&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java Tue Aug  9 
20:18:47 2011
@@ -1671,7 +1671,6 @@ public class CliClient
 normaliseType(cfDef.key_validation_class, 
"org.apache.cassandra.db.marshal"));
 writeAttr(sb, false, "memtable_operations", 
cfDef.memtable_operations_in_millions);
 writeAttr(sb, false, "memtable_throughput", 
cfDef.memtable_throughput_in_mb);
-writeAttr(sb, false, "memtable_flush_after", 
cfDef.memtable_flush_after_mins);
 writeAttr(sb, false, "rows_cached", cfDef.row_cache_size);
 writeAttr(sb, false, "row_cache_save_period", 
cfDef.row_cache_save_period_in_seconds);
 writeAttr(sb, false, "keys_cached", cfDef.key_cache_size);




buildbot success in ASF Buildbot on cassandra-trunk

2011-08-09 Thread buildbot
The Buildbot has detected a restored build on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1504

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1155544
Blamelist: jbellis

Build succeeded!

sincerely,
 -The Buildbot



svn commit: r1155548 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/cql/UpdateStatement.java src/java/org/apache/cassandra/thrift/ThriftValidation.java test/system/t

2011-08-09 Thread slebresne
Author: slebresne
Date: Tue Aug  9 20:24:17 2011
New Revision: 1155548

URL: http://svn.apache.org/viewvc?rev=1155548&view=rev
Log:
Refuse counter write at CL.ANY
patch by slebresne; reviewed by jbellis for CASSANDRA-2990

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
cassandra/branches/cassandra-0.8/test/system/test_cql.py
cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155548&r1=1155547&r2=1155548&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug  9 20:24:17 2011
@@ -2,6 +2,7 @@
  * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972)
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
  * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
+ * refuse counter write for CL.ANY (CASSANDRA-2990)
 
 
 0.8.3

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java?rev=1155548&r1=1155547&r2=1155548&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java
 Tue Aug  9 20:24:17 2011
@@ -39,6 +39,7 @@ import static org.apache.cassandra.cql.Q
 
 import static org.apache.cassandra.cql.Operation.OperationType;
 import static 
org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily;
+import static 
org.apache.cassandra.thrift.ThriftValidation.validateCommutativeForWrite;
 
 /**
  * An UPDATE statement parsed from a CQL query statement.
@@ -142,6 +143,8 @@ public class UpdateStatement extends Abs
 }
 
 CFMetaData metadata = validateColumnFamily(keyspace, columnFamily, 
hasCommutativeOperation);
+if (hasCommutativeOperation)
+validateCommutativeForWrite(metadata, cLevel);
 
 QueryProcessor.validateKeyAlias(metadata, keyName);
 

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java?rev=1155548&r1=1155547&r2=1155548&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
 Tue Aug  9 20:24:17 2011
@@ -627,7 +627,11 @@ public class ThriftValidation
 
 public static void validateCommutativeForWrite(CFMetaData metadata, 
ConsistencyLevel consistency) throws InvalidRequestException
 {
-if (!metadata.getReplicateOnWrite() && consistency != 
ConsistencyLevel.ONE)
+if (consistency == ConsistencyLevel.ANY)
+{
+throw new InvalidRequestException("Consistency level ANY is not 
yet supported for counter columnfamily " + metadata.cfName);
+}
+else if (!metadata.getReplicateOnWrite() && consistency != 
ConsistencyLevel.ONE)
 {
 throw new InvalidRequestException("cannot achieve CL > CL.ONE 
without replicate_on_write on columnfamily " + metadata.cfName);
 }
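[Editorial note] The patched check can be summarized as two rules: reject CL.ANY outright for counters, and reject any CL above ONE when replicate_on_write is disabled. A Python sketch mirroring the Java logic above (names hypothetical; the authoritative check is the ThriftValidation code in the diff):

```python
def validate_commutative_for_write(cf_name, consistency, replicate_on_write):
    """Mirror of the patched validateCommutativeForWrite."""
    if consistency == "ANY":
        # New in this patch: ANY is never acceptable for counter writes.
        raise ValueError("Consistency level ANY is not yet supported for "
                         "counter columnfamily " + cf_name)
    if not replicate_on_write and consistency != "ONE":
        # Pre-existing rule: without replicate_on_write only CL.ONE works.
        raise ValueError("cannot achieve CL > CL.ONE without "
                         "replicate_on_write on columnfamily " + cf_name)
```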

Modified: cassandra/branches/cassandra-0.8/test/system/test_cql.py
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/test/system/test_cql.py?rev=1155548&r1=1155547&r2=1155548&view=diff
==
--- cassandra/branches/cassandra-0.8/test/system/test_cql.py (original)
+++ cassandra/branches/cassandra-0.8/test/system/test_cql.py Tue Aug  9 
20:24:17 2011
@@ -1260,6 +1260,11 @@ class TestCql(ThriftTester):
   cursor.execute,
   "UPDATE CounterCF SET count_me = count_not_me + 2 WHERE 
key = 'counter1'")
 
+# counters can't do ANY
+assert_raises(cql.ProgrammingError,
+  cursor.execute,
+  "UPDATE CounterCF USING CONSISTENCY ANY SET count_me = 
count_me + 2 WHERE key = 'counter1'")
+
 def test_key_alias_support(self):
 "should be possible to use alias instead of KEY keyword"
 cursor = init()

Modified: cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py
URL: 
http://svn.a

svn commit: r1155549 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/cql/ src/java/org/apache/cassandra/thrift/ test/system/

2011-08-09 Thread slebresne
Author: slebresne
Date: Tue Aug  9 20:26:07 2011
New Revision: 1155549

URL: http://svn.apache.org/viewvc?rev=1155549&view=rev
Log:
commit from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/cql/UpdateStatement.java
cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java
cassandra/trunk/test/system/test_cql.py
cassandra/trunk/test/system/test_thrift_server.py

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 20:26:07 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
 /cassandra/branches/cassandra-0.7:1026516-1151306
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460,1155548
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1155549&r1=1155548&r2=1155549&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Tue Aug  9 20:26:07 2011
@@ -35,6 +35,7 @@
  * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972)
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
  * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
+ * refuse counter write for CL.ANY (CASSANDRA-2990)
 
 
 0.8.3

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 20:26:07 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
 /cassandra/branches/cassandra-0.7/contrib:1026516-1151306
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460,1155548
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 20:26:07 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1151306
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1155460
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1155460,1155548
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369
 
/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 20:26:07 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandr

[jira] [Commented] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081901#comment-13081901
 ] 

Brandon Williams commented on CASSANDRA-3004:
-

+1

> Once a message has been dropped, cassandra logs total messages dropped and 
> tpstats every 5s forever
> ---
>
> Key: CASSANDRA-3004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3004
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.8.3
>Reporter: Brandon Williams
>Assignee: Jonathan Ellis
>Priority: Minor
>  Labels: lhf
> Fix For: 0.8.4
>
> Attachments: 3004.txt
>
>






svn commit: r1155558 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/net/MessagingService.java

2011-08-09 Thread jbellis
Author: jbellis
Date: Tue Aug  9 20:47:35 2011
New Revision: 1155558

URL: http://svn.apache.org/viewvc?rev=1155558&view=rev
Log:
switch back to only logging recent dropped messages
patch by jbellis; reviewed by brandonwilliams for CASSANDRA-3004

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155558&r1=1155557&r2=1155558&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug  9 20:47:35 2011
@@ -3,6 +3,7 @@
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
  * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
  * refuse counter write for CL.ANY (CASSANDRA-2990)
+ * switch back to only logging recent dropped messages (CASSANDRA-3004)
 
 
 0.8.3

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java?rev=1155558&r1=1155557&r2=1155558&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java
 Tue Aug  9 20:47:35 2011
@@ -100,18 +100,11 @@ public final class MessagingService impl
private final Map<StorageService.Verb, AtomicInteger> droppedMessages = 
new EnumMap<StorageService.Verb, AtomicInteger>(StorageService.Verb.class);
 // dropped count when last requested for the Recent api.  high concurrency 
isn't necessary here.
private final Map<StorageService.Verb, Integer> lastDropped = 
Collections.synchronizedMap(new EnumMap<StorageService.Verb, Integer>(StorageService.Verb.class));
+private final Map<StorageService.Verb, Integer> lastDroppedInternal = new 
EnumMap<StorageService.Verb, Integer>(StorageService.Verb.class);
 
private final List<ILatencySubscriber> subscribers = new 
ArrayList<ILatencySubscriber>();
 private static final long DEFAULT_CALLBACK_TIMEOUT = (long) (1.1 * 
DatabaseDescriptor.getRpcTimeout());
 
-{
-for (StorageService.Verb verb : DROPPABLE_VERBS)
-{
-droppedMessages.put(verb, new AtomicInteger());
-lastDropped.put(verb, 0);
-}
-}
-
 private static class MSHandle
 {
 public static final MessagingService instance = new MessagingService();
@@ -123,6 +116,13 @@ public final class MessagingService impl
 
 private MessagingService()
 {
+for (StorageService.Verb verb : DROPPABLE_VERBS)
+{
+droppedMessages.put(verb, new AtomicInteger());
+lastDropped.put(verb, 0);
+lastDroppedInternal.put(verb, 0);
+}
+
 listenGate = new SimpleCondition();
verbHandlers_ = new EnumMap<StorageService.Verb, IVerbHandler>(StorageService.Verb.class);
 streamExecutor_ = new DebuggableThreadPoolExecutor("Streaming", 
DatabaseDescriptor.getCompactionThreadPriority());
@@ -584,11 +584,13 @@ public final class MessagingService impl
for (Map.Entry<StorageService.Verb, AtomicInteger> entry : 
droppedMessages.entrySet())
 {
 AtomicInteger dropped = entry.getValue();
-if (dropped.get() > 0)
+StorageService.Verb verb = entry.getKey();
+int recent = dropped.get() - lastDroppedInternal.get(verb);
+if (recent > 0)
 {
 logTpstats = true;
-logger_.info("{} {} messages dropped in server lifetime",
- dropped, entry.getKey());
+logger_.info("{} {} messages dropped in server lifetime", 
recent, verb);
+lastDroppedInternal.put(verb, dropped.get());
 }
 }
 
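[Editorial note] The change above replaces the lifetime total with a delta since the last logging pass, so a node that dropped messages once does not log forever. The bookkeeping pattern can be sketched as follows (hypothetical class name; the real implementation is the Java diff above):

```python
class DroppedMessageLogger:
    """Keep a lifetime dropped counter per verb plus a snapshot taken at
    the last log pass, and report only the delta ("recent" drops)."""

    def __init__(self, verbs):
        self.dropped = {v: 0 for v in verbs}      # lifetime totals
        self.last_logged = {v: 0 for v in verbs}  # snapshot at last pass

    def record_drop(self, verb):
        self.dropped[verb] += 1

    def log_pass(self):
        """Return {verb: recent} for verbs dropped since the last pass,
        advancing the snapshot -- a quiet period reports nothing."""
        recent = {}
        for verb, total in self.dropped.items():
            delta = total - self.last_logged[verb]
            if delta > 0:
                recent[verb] = delta
                self.last_logged[verb] = total
        return recent
```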




[jira] [Updated] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3004:
--

Affects Version/s: (was: 0.8.3)
   0.8.2
   Issue Type: Improvement  (was: Bug)

> Once a message has been dropped, cassandra logs total messages dropped and 
> tpstats every 5s forever
> ---
>
> Key: CASSANDRA-3004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3004
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.2
>Reporter: Brandon Williams
>Assignee: Jonathan Ellis
>Priority: Minor
>  Labels: lhf
> Fix For: 0.8.4
>
> Attachments: 3004.txt
>
>






[jira] [Updated] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2325:
--

Affects Version/s: (was: 0.7.8)
   (was: 0.8.2)
   0.6
Fix Version/s: 0.8.4

> invalidateKeyCache / invalidateRowCache should remove saved cache files from 
> disk
> -
>
> Key: CASSANDRA-2325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2325
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 0.6
>Reporter: Matthew F. Dennis
>Assignee: Edward Capriolo
>Priority: Minor
> Fix For: 0.8.4
>
> Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt
>
>
> the invalidate[Key|Row]Cache calls don't remove the saved caches from disk.
> It seems logical that if you are clearing the caches you don't expect them to 
> be reinstantiated with the old values the next time C* starts.
> This is not a huge issue since next time the caches are saved the old values 
> will be removed.





[jira] [Commented] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081920#comment-13081920
 ] 

Jonathan Ellis commented on CASSANDRA-2325:
---

Shouldn't we check that the file exists first? Otherwise we log spurious 
errors.

> invalidateKeyCache / invalidateRowCache should remove saved cache files from 
> disk
> -
>
> Key: CASSANDRA-2325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2325
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 0.6
>Reporter: Matthew F. Dennis
>Assignee: Edward Capriolo
>Priority: Minor
> Fix For: 0.8.4
>
> Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt
>
>
> the invalidate[Key|Row]Cache calls don't remove the saved caches from disk.
> It seems logical that if you are clearing the caches you don't expect them to 
> be reinstantiated with the old values the next time C* starts.
> This is not a huge issue since next time the caches are saved the old values 
> will be removed.





[jira] [Assigned] (CASSANDRA-3005) OutboundTcpConnection's sending queue goes unboundedly without any backpressure logic

2011-08-09 Thread Melvin Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Melvin Wang reassigned CASSANDRA-3005:
--

Assignee: Melvin Wang

> OutboundTcpConnection's sending queue goes unboundedly without any 
> backpressure logic
> -
>
> Key: CASSANDRA-3005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3005
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Melvin Wang
>Assignee: Melvin Wang
>
> OutboundTcpConnection's sending queue unconditionally queues up the request 
> and process them in sequence. Thinking about tagging the message coming in 
> with timestamp and drop them before actually sending it if the message stays 
> in the queue for too long, which is defined by the message's own time out 
> value.
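[Editorial note] The proposal above (tag each message with its enqueue time and drop it at send time if it has outlived its own timeout) might look like this sketch, with hypothetical names and an explicit clock passed in for clarity:

```python
import collections

class TimestampedOutboundQueue:
    """Sketch of the proposed backpressure: rather than letting the queue
    grow without bound, record the enqueue time of each message and drop
    it at send time if it has already exceeded its own timeout."""

    def __init__(self):
        self._q = collections.deque()

    def enqueue(self, msg, timeout, now):
        self._q.append((now, timeout, msg))

    def drain(self, now):
        """Return (sent_messages, dropped_count) for everything queued."""
        sent, dropped = [], 0
        while self._q:
            enqueued_at, timeout, msg = self._q.popleft()
            if now - enqueued_at > timeout:
                dropped += 1      # stale: already past its own timeout
            else:
                sent.append(msg)
        return sent, dropped
```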





[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081930#comment-13081930
 ] 

Pavel Yaskevich commented on CASSANDRA-2988:


First of all, I would like to point you to 
http://wiki.apache.org/cassandra/CodeStyle; please modify your code to follow 
the conventions listed there.

Regarding c2988-modified-buffer.patch:

 - please encapsulate your modifications: as written, comparing the code 
before and after the patch is hard to understand. I would suggest moving those 
modifications into a separate inner class (IndexReader, maybe?) and replacing 
only the RandomAccessReader initialization in the 
SSTableReader.load(...) method...
 - let's add a test comparing "getEstimatedRowSize().count();" and 
"SSTable.estimateRowsFromIndex(input);" just to be sure it works correctly.

Also, I don't quite understand the logic behind "while (buffer.remaining() > 10) {" 
in SSTableReader.loadByteBuffer; let's avoid any hardcoding, or at least 
comment on why you did that.

I'm going to take a closer look at the patch for parallel index file loading 
after we are done with the index reader patch (c2988-modified-buffer.patch).

> Improve SSTableReader.load() when loading index files
> -
>
> Key: CASSANDRA-2988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Melvin Wang
>Assignee: Melvin Wang
>Priority: Minor
> Fix For: 1.0
>
> Attachments: c2988-modified-buffer.patch, 
> c2988-parallel-load-sstables.patch
>
>
> * when we create BufferredRandomAccessFile, we pass skipCache=true. This 
> hurts the read performance because we always process the index files 
> sequentially. Simple fix would be set it to false.
> * multiple index files of a single column family can be loaded in parallel. 
> This buys a lot when you have multiple super large index files.
> * we may also change how we buffer. By using BufferredRandomAccessFile, for 
> every read, we need bunch of checking like
>   - do we need to rebuffer?
>   - isEOF()?
>   - assertions
>   These can be simplified to some extent.  We can blindly buffer the index 
> file by chunks and process the buffer until a key lies across boundary of a 
> chunk. Then we rebuffer and start from the beginning of the partially read 
> key. Conceptually, this is same as what BRAF does but w/o the overhead in the 
> read**() methods in BRAF.
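[Editorial note] The third bullet (blindly buffer in chunks, and rebuffer from the start of a partially read key when an entry straddles a chunk boundary) can be sketched as below. The record format here (two-byte big-endian length prefix followed by the key) is purely illustrative of the boundary handling, not Cassandra's actual index layout:

```python
import struct

def iter_index_keys(f, chunk_size=4096):
    """Blindly read fixed-size chunks and parse length-prefixed keys out
    of the buffer; when a key crosses a chunk boundary, carry the partial
    bytes forward and rebuffer from the start of that key."""
    buf = b""
    while True:
        data = f.read(chunk_size)
        buf += data
        pos = 0
        while len(buf) - pos >= 2:
            (klen,) = struct.unpack_from(">H", buf, pos)
            if len(buf) - pos - 2 < klen:
                break            # key crosses the boundary: rebuffer
            yield buf[pos + 2:pos + 2 + klen]
            pos += 2 + klen
        buf = buf[pos:]          # keep the partially read entry
        if not data:
            if buf:
                raise ValueError("truncated index entry")
            break
```

Conceptually this does the same work as BufferedRandomAccessFile, but the per-read rebuffer/EOF checks collapse into one check per entry.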





[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081941#comment-13081941
 ] 

Brandon Williams commented on CASSANDRA-2950:
-

Currently, truncate does:
* force a flush
* record the time
* delete any sstables older than the time

This isn't quite enough if the machine crashes shortly afterward, however, 
since there can be mutations present in the commitlog that were previously 
truncated and are now resurrected by CL replay.

One thing we could do is record the truncate time for the CF in the system ks 
and then ignore mutations older than that, however this would require time 
synchronization between the client and the server to be accurate.
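[Editorial note] The approach sketched above (persist a per-CF truncation time and filter replayed mutations against it) could look like the following, with hypothetical names and an in-memory dict standing in for the system keyspace; as noted, its correctness hinges on timestamp agreement:

```python
class TruncationFilter:
    """Record the truncate time per column family and ignore commitlog
    mutations stamped at or before it during replay."""

    def __init__(self):
        self.truncated_at = {}   # cf name -> truncate timestamp

    def record_truncate(self, cf, timestamp):
        self.truncated_at[cf] = timestamp

    def should_replay(self, cf, mutation_timestamp):
        # CFs that were never truncated replay everything.
        return mutation_timestamp > self.truncated_at.get(cf, float("-inf"))
```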


> Data from truncated CF reappears after server restart
> -
>
> Key: CASSANDRA-2950
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2950
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cathy Daw
>Assignee: Brandon Williams
>
> * Configure 3 node cluster
> * Ensure the java stress tool creates Keyspace1 with RF=3
> {code}
> // Run Stress Tool to generate 10 keys, 1 column
> stress --operation=INSERT -t 2 --num-keys=50 --columns=20 
> --consistency-level=QUORUM --average-size-values --replication-factor=3 
> --create-index=KEYS --nodes=cathy1,cathy2
> // Verify 50 keys in CLI
> use Keyspace1; 
> list Standard1; 
> // TRUNCATE CF in CLI
> use Keyspace1;
> truncate counter1;
> list counter1;
> // Run stress tool and verify creation of 1 key with 10 columns
> stress --operation=INSERT -t 2 --num-keys=1 --columns=10 
> --consistency-level=QUORUM --average-size-values --replication-factor=3 
> --create-index=KEYS --nodes=cathy1,cathy2
> // Verify 1 key in CLI
> use Keyspace1; 
> list Standard1; 
> // Restart all three nodes
> // You will see 51 keys in CLI
> use Keyspace1; 
> list Standard1; 
> {code}





[jira] [Commented] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081948#comment-13081948
 ] 

Hudson commented on CASSANDRA-3004:
---

Integrated in Cassandra-0.8 #265 (See 
[https://builds.apache.org/job/Cassandra-0.8/265/])
switch back to only logging recent dropped messages
patch by jbellis; reviewed by brandonwilliams for CASSANDRA-3004

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155558
Files : 
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java


> Once a message has been dropped, cassandra logs total messages dropped and 
> tpstats every 5s forever
> ---
>
> Key: CASSANDRA-3004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3004
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.8.2
>Reporter: Brandon Williams
>Assignee: Jonathan Ellis
>Priority: Minor
>  Labels: lhf
> Fix For: 0.8.4
>
> Attachments: 3004.txt
>
>






[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081947#comment-13081947
 ] 

Hudson commented on CASSANDRA-2990:
---

Integrated in Cassandra-0.8 #265 (See 
[https://builds.apache.org/job/Cassandra-0.8/265/])
Refuse counter write at CL.ANY
patch by slebresne; reviewed by jbellis for CASSANDRA-2990

slebresne : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155548
Files : 
* /cassandra/branches/cassandra-0.8/test/system/test_cql.py
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* /cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java


> We should refuse query for counters at CL.ANY
> -
>
> Key: CASSANDRA-2990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
>  Labels: counters
> Fix For: 0.8.4
>
> Attachments: 2990.patch
>
>
> We currently do not reject writes for counters at CL.ANY, even though this is 
> not supported (and rightly so).





[jira] [Updated] (CASSANDRA-2982) Refactor secondary index api

2011-08-09 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-2982:
--

Attachment: 2982-v1.txt

Refactored the api; it should cover new index types. Should we consider 
removing the IndexType enum and just using the classname?

> Refactor secondary index api
> 
>
> Key: CASSANDRA-2982
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2982
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 1.0
>
> Attachments: 2982-v1.txt
>
>
> Secondary indexes currently make some bad assumptions about the underlying 
> indexes.
> 1. That they are always stored in other column families.
> 2. That there is a unique index per column
> In the case of CASSANDRA-2915 neither of these are true.  The new api should 
> abstract the search concepts and allow any search api to plug in.
> Once the code is refactored and basically pluggable we can remove the 
> IndexType enum and use class names similar to how we handle partitioners and 
> comparators.
> Basic api is to add a SecondaryIndexManager that handles different index 
> types per CF and a SecondaryIndex base class that handles a particular type 
> implementation.
> This requires major changes to ColumnFamilyStore and Table.IndexBuilder





[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081966#comment-13081966
 ] 

Jonathan Ellis commented on CASSANDRA-2950:
---

But we record the CL "context" at the time of flush in the sstable it makes, 
and on replay we ignore any mutations from before that position.

Checked, and we do wait for flush to complete in truncate.

> Data from truncated CF reappears after server restart
> -
>
> Key: CASSANDRA-2950
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2950
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cathy Daw
>Assignee: Brandon Williams
>
> * Configure 3 node cluster
> * Ensure the java stress tool creates Keyspace1 with RF=3
> {code}
> // Run stress tool to generate 50 keys, 20 columns
> stress --operation=INSERT -t 2 --num-keys=50 --columns=20 
> --consistency-level=QUORUM --average-size-values --replication-factor=3 
> --create-index=KEYS --nodes=cathy1,cathy2
> // Verify 50 keys in CLI
> use Keyspace1; 
> list Standard1; 
> // TRUNCATE CF in CLI
> use Keyspace1;
> truncate counter1;
> list counter1;
> // Run stress tool and verify creation of 1 key with 10 columns
> stress --operation=INSERT -t 2 --num-keys=1 --columns=10 
> --consistency-level=QUORUM --average-size-values --replication-factor=3 
> --create-index=KEYS --nodes=cathy1,cathy2
> // Verify 1 key in CLI
> use Keyspace1; 
> list Standard1; 
> // Restart all three nodes
> // You will see 51 keys in CLI
> use Keyspace1; 
> list Standard1; 
> {code}





[jira] [Commented] (CASSANDRA-2982) Refactor secondary index api

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081968#comment-13081968
 ] 

Jonathan Ellis commented on CASSANDRA-2982:
---

I don't think full index pluggability is a goal here.  So I don't see the point 
of that.

> Refactor secondary index api
> 
>
> Key: CASSANDRA-2982
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2982
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 1.0
>
> Attachments: 2982-v1.txt
>
>
> Secondary indexes currently make some bad assumptions about the underlying 
> indexes.
> 1. That they are always stored in other column families.
> 2. That there is a unique index per column
> In the case of CASSANDRA-2915 neither of these are true.  The new api should 
> abstract the search concepts and allow any search api to plug in.
> Once the code is refactored and basically pluggable we can remove the 
> IndexType enum and use class names similar to how we handle partitioners and 
> comparators.
> Basic api is to add a SecondaryIndexManager that handles different index 
> types per CF and a SecondaryIndex base class that handles a particular type 
> implementation.
> This requires major changes to ColumnFamilyStore and Table.IndexBuilder





[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081969#comment-13081969
 ] 

Brandon Williams commented on CASSANDRA-2950:
-

bq. but we record CL "context" at time of flush in the sstable it makes, and we 
on replay we ignore any mutations from before that position.

I think there's something wrong with that, then:

{noformat}
 INFO 21:25:15,274 Replaying 
/var/lib/cassandra/commitlog/CommitLog-1312924388053.log
DEBUG 21:25:15,290 Replaying 
/var/lib/cassandra/commitlog/CommitLog-1312924388053.log starting at 0
DEBUG 21:25:15,291 Reading mutation at 0
DEBUG 21:25:15,295 replaying mutation for system.4c: {ColumnFamily(LocationInfo 
[47656e65726174696f6e:false:4@131292438814,])}
DEBUG 21:25:15,321 Reading mutation at 89
DEBUG 21:25:15,322 replaying mutation for system.426f6f747374726170: 
{ColumnFamily(LocationInfo [42:false:1@1312924388203,])}
DEBUG 21:25:15,322 Reading mutation at 174
DEBUG 21:25:15,322 replaying mutation for system.4c: {ColumnFamily(LocationInfo 
[546f6b656e:false:16@1312924388204,])}
DEBUG 21:25:15,322 Reading mutation at 270
DEBUG 21:25:15,324 replaying mutation for Keyspace1.3030: 
{ColumnFamily(Standard1 
[C0:false:34@1312924813259,C1:false:34@1312924813260,C2:false:34@1312924813260,C3:false:34@1312924813260,C4:false:34@1312924813260,])}
{noformat}

The last entry there is the first of many errant mutations.

> Data from truncated CF reappears after server restart
> -
>
> Key: CASSANDRA-2950
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2950
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cathy Daw
>Assignee: Brandon Williams
>
> * Configure 3 node cluster
> * Ensure the java stress tool creates Keyspace1 with RF=3
> {code}
> // Run stress tool to generate 50 keys, 20 columns
> stress --operation=INSERT -t 2 --num-keys=50 --columns=20 
> --consistency-level=QUORUM --average-size-values --replication-factor=3 
> --create-index=KEYS --nodes=cathy1,cathy2
> // Verify 50 keys in CLI
> use Keyspace1; 
> list Standard1; 
> // TRUNCATE CF in CLI
> use Keyspace1;
> truncate counter1;
> list counter1;
> // Run stress tool and verify creation of 1 key with 10 columns
> stress --operation=INSERT -t 2 --num-keys=1 --columns=10 
> --consistency-level=QUORUM --average-size-values --replication-factor=3 
> --create-index=KEYS --nodes=cathy1,cathy2
> // Verify 1 key in CLI
> use Keyspace1; 
> list Standard1; 
> // Restart all three nodes
> // You will see 51 keys in CLI
> use Keyspace1; 
> list Standard1; 
> {code}





[jira] [Assigned] (CASSANDRA-2950) Data from truncated CF reappears after server restart

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-2950:
-

Assignee: Jonathan Ellis  (was: Brandon Williams)

> Data from truncated CF reappears after server restart
> -
>
> Key: CASSANDRA-2950
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2950
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cathy Daw
>Assignee: Jonathan Ellis
>
> * Configure 3 node cluster
> * Ensure the java stress tool creates Keyspace1 with RF=3
> {code}
> // Run stress tool to generate 50 keys, 20 columns
> stress --operation=INSERT -t 2 --num-keys=50 --columns=20 
> --consistency-level=QUORUM --average-size-values --replication-factor=3 
> --create-index=KEYS --nodes=cathy1,cathy2
> // Verify 50 keys in CLI
> use Keyspace1; 
> list Standard1; 
> // TRUNCATE CF in CLI
> use Keyspace1;
> truncate counter1;
> list counter1;
> // Run stress tool and verify creation of 1 key with 10 columns
> stress --operation=INSERT -t 2 --num-keys=1 --columns=10 
> --consistency-level=QUORUM --average-size-values --replication-factor=3 
> --create-index=KEYS --nodes=cathy1,cathy2
> // Verify 1 key in CLI
> use Keyspace1; 
> list Standard1; 
> // Restart all three nodes
> // You will see 51 keys in CLI
> use Keyspace1; 
> list Standard1; 
> {code}





[jira] [Created] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jonathan Ellis (JIRA)
Java CQL command-line shell
---

 Key: CASSANDRA-3010
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0


We need a "real" CQL shell that:

- does not require installing additional environments
- includes "show keyspaces" and other introspection tools
- does not break existing cli scripts

I.e., it needs to be java, but it should be a new tool instead of replacing the 
existing cli.





[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081974#comment-13081974
 ] 

Jonathan Ellis commented on CASSANDRA-3010:
---

I.e., do we do "\d CF" (postgresql) or "describe CF" (mysql) or "desc CF" 
(oracle)?

> Java CQL command-line shell
> ---
>
> Key: CASSANDRA-3010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
>
> We need a "real" CQL shell that:
> - does not require installing additional environments
> - includes "show keyspaces" and other introspection tools
> - does not break existing cli scripts
> I.e., it needs to be java, but it should be a new tool instead of replacing 
> the existing cli.





[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081973#comment-13081973
 ] 

Jonathan Ellis commented on CASSANDRA-3010:
---

We should also pick a SQL command line to imitate for the introspection stuff. 
Might as well get that degree of familiarity as well since there is no reason 
not to.

> Java CQL command-line shell
> ---
>
> Key: CASSANDRA-3010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
>
> We need a "real" CQL shell that:
> - does not require installing additional environments
> - includes "show keyspaces" and other introspection tools
> - does not break existing cli scripts
> I.e., it needs to be java, but it should be a new tool instead of replacing 
> the existing cli.





[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081977#comment-13081977
 ] 

Jeremy Hanna commented on CASSANDRA-3010:
-

If I had to choose one, it would be nice to be more descriptive (describe 
versus \d).  However, it would be really nice to have a basic concept of 
synonyms.  For example mysql's cli supports both describe and desc.  Building 
that type of functionality in from the start shouldn't be too onerous.

> Java CQL command-line shell
> ---
>
> Key: CASSANDRA-3010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
>
> We need a "real" CQL shell that:
> - does not require installing additional environments
> - includes "show keyspaces" and other introspection tools
> - does not break existing cli scripts
> I.e., it needs to be java, but it should be a new tool instead of replacing 
> the existing cli.





[jira] [Issue Comment Edited] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081977#comment-13081977
 ] 

Jeremy Hanna edited comment on CASSANDRA-3010 at 8/9/11 10:36 PM:
--

If I had to choose one, it would be nice to be more descriptive (describe 
versus \d).  However, it would be really nice to have a basic concept of 
synonyms.  For example mysql's cli supports both describe and desc.  Building 
that type of functionality in from the start would hopefully not be too onerous.

  was (Author: jeromatron):
If I had to choose one, it would be nice to be more descriptive (describe 
versus \d).  However, it would be really nice to have a basic concept of 
synonyms.  For example mysql's cli supports both describe and desc.  Building 
that type of functionality in from the start shouldn't be too onerous.
  
> Java CQL command-line shell
> ---
>
> Key: CASSANDRA-3010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
>
> We need a "real" CQL shell that:
> - does not require installing additional environments
> - includes "show keyspaces" and other introspection tools
> - does not break existing cli scripts
> I.e., it needs to be java, but it should be a new tool instead of replacing 
> the existing cli.





[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081982#comment-13081982
 ] 

Pavel Yaskevich commented on CASSANDRA-3010:


I don't think we should choose just one, because we can support all of those 
notations using synonyms in the ANTLR grammar. It would be hard to include every 
possible synonym from the beginning, but the grammar will be designed in a way 
that makes it easy to add new synonyms as we go.
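
The synonym idea can be sketched outside the grammar as a simple normalization
table. This is a hypothetical illustration in Java; the real shell would express
the same mapping directly in its ANTLR rules:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of command-synonym normalization; the spellings come from the
// shells discussed above (postgresql "\d", mysql "describe"/"desc").
public class CommandSynonyms {
    private static final Map<String, String> SYNONYMS = new HashMap<>();
    static {
        SYNONYMS.put("describe", "DESCRIBE");
        SYNONYMS.put("desc", "DESCRIBE");
        SYNONYMS.put("\\d", "DESCRIBE");
        SYNONYMS.put("show", "SHOW");
    }

    /** Resolve a user-typed keyword to its canonical command, or null if unknown. */
    public static String canonical(String typed) {
        return SYNONYMS.get(typed.toLowerCase());
    }
}
```

New spellings are added by inserting one more map entry (or, in the grammar,
one more alternative in the lexer rule), which is the "easy to add as we go"
property being argued for.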

> Java CQL command-line shell
> ---
>
> Key: CASSANDRA-3010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 1.0
>
>
> We need a "real" CQL shell that:
> - does not require installing additional environments
> - includes "show keyspaces" and other introspection tools
> - does not break existing cli scripts
> I.e., it needs to be java, but it should be a new tool instead of replacing 
> the existing cli.





[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files

2011-08-09 Thread Melvin Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082008#comment-13082008
 ] 

Melvin Wang commented on CASSANDRA-2988:


bq. First of all I would like to point you to 
http://wiki.apache.org/cassandra/CodeStyle, please modify your code according 
to conventions listed in there.
Sure. This boils down to where to put the curly braces.

bq. please encapsulate your modifications because if you compare how it was and 
how it is in your patch it's hard to understand and just looks like a mess, I 
would like to suggest moving those modifications to separate inner class 
(IndexReader maybe?) and replace only RandomAccessReader initialization in the 
SSTableReader.load(...) method...
This patch changes most of the load() method, so I am not clear how we could 
change only the initialization of RandomAccessReader.

bq. Also I don't quite understand the logic behind "while (buffer.remaining() > 10) 
{" in SSTableReader.loadByteBuffer; let's avoid any hardcoding, or at least 
comment why you did that.
Sorry for the lack of comments; I will add them. However, this is not hardcoding 
in the sense that a Short consists of 2 bytes and a Long consists of 8 bytes, so 
the sum is 10 bytes. It is just a quick check for whether we have reached the end.

bq. I'm going to take a closer look at the patch for parallel index file loading 
after we are done with the index reader patch (c2988-modified-buffer.patch).
FYI, these two patches are completely independent of each other.

> Improve SSTableReader.load() when loading index files
> -
>
> Key: CASSANDRA-2988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Melvin Wang
>Assignee: Melvin Wang
>Priority: Minor
> Fix For: 1.0
>
> Attachments: c2988-modified-buffer.patch, 
> c2988-parallel-load-sstables.patch
>
>
> * when we create BufferredRandomAccessFile, we pass skipCache=true. This 
> hurts the read performance because we always process the index files 
> sequentially. Simple fix would be set it to false.
> * multiple index files of a single column family can be loaded in parallel. 
> This buys a lot when you have multiple super large index files.
> * we may also change how we buffer. By using BufferredRandomAccessFile, for 
> every read, we need bunch of checking like
>   - do we need to rebuffer?
>   - isEOF()?
>   - assertions
>   These can be simplified to some extent.  We can blindly buffer the index 
> file by chunks and process the buffer until a key lies across boundary of a 
> chunk. Then we rebuffer and start from the beginning of the partially read 
> key. Conceptually, this is same as what BRAF does but w/o the overhead in the 
> read**() methods in BRAF.
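
The chunk-and-rebuffer scheme in the last bullet can be sketched as follows. The
record format here (a 2-byte length prefix plus payload) is a hypothetical
stand-in for the real index entry layout, which also carries a long position:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of chunked reading with restart-on-partial-record: buffer a fixed
// chunk, consume whole records, and when a record straddles the chunk
// boundary, start the next chunk from the beginning of that record.
public class ChunkedReader {
    public static List<byte[]> readAll(byte[] file, int chunkSize) {
        List<byte[]> records = new ArrayList<>();
        int pos = 0; // absolute position of the next unread record
        while (pos < file.length) {
            int chunkEnd = Math.min(pos + chunkSize, file.length);
            int p = pos;
            while (true) {
                // does the 2-byte length prefix fit in this chunk?
                if (p + 2 > chunkEnd) break;
                int len = ((file[p] & 0xff) << 8) | (file[p + 1] & 0xff);
                // does the whole record fit? if not, rebuffer starting at p
                if (p + 2 + len > chunkEnd) break;
                byte[] rec = new byte[len];
                System.arraycopy(file, p + 2, rec, 0, len);
                records.add(rec);
                p += 2 + len;
            }
            // a record larger than the chunk: a real reader would grow the
            // buffer here, this sketch just stops
            if (p == pos) break;
            pos = p; // next chunk begins at the partially read record
        }
        return records;
    }
}
```

As the description says, this is conceptually what BRAF does, but the
per-read rebuffer/EOF/assertion checks collapse into one boundary test per
record.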





[jira] [Updated] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

2011-08-09 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2777:


Attachment: 2777-v2.txt

v2 rebased.

> Pig storage handler should implement LoadMetadata
> -
>
> Key: CASSANDRA-2777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2777
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 0.7.9
>
> Attachments: 2777-v2.txt, 2777.txt
>
>
> The reason for this is many builtin functions like SUM won't work on longs 
> (you can work around this with LongSum, but that's lame) because the query planner 
> doesn't know about the types beforehand, even though we are casting to native 
> longs.
> There is some impact to this, though.  With LoadMetadata implemented, 
> existing scripts that specify schema will need to remove it (since LM is 
> doing it for them) and they will need to conform to LM's terminology (key, 
> columns, name, value) within the script.  This is trivial to change, however, 
> and the increased functionality is worth the switch.





[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082016#comment-13082016
 ] 

Jonathan Ellis commented on CASSANDRA-2988:
---

bq. Short consists of 2 bytes and Long consists of 8 bytes, the sum is 10 bytes

IMO that's more obvious if you leave it as "2 + 8," or use the DBConstants 
class.

> Improve SSTableReader.load() when loading index files
> -
>
> Key: CASSANDRA-2988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Melvin Wang
>Assignee: Melvin Wang
>Priority: Minor
> Fix For: 1.0
>
> Attachments: c2988-modified-buffer.patch, 
> c2988-parallel-load-sstables.patch
>
>
> * when we create BufferredRandomAccessFile, we pass skipCache=true. This 
> hurts the read performance because we always process the index files 
> sequentially. Simple fix would be set it to false.
> * multiple index files of a single column family can be loaded in parallel. 
> This buys a lot when you have multiple super large index files.
> * we may also change how we buffer. By using BufferredRandomAccessFile, for 
> every read, we need bunch of checking like
>   - do we need to rebuffer?
>   - isEOF()?
>   - assertions
>   These can be simplified to some extent.  We can blindly buffer the index 
> file by chunks and process the buffer until a key lies across boundary of a 
> chunk. Then we rebuffer and start from the beginning of the partially read 
> key. Conceptually, this is same as what BRAF does but w/o the overhead in the 
> read**() methods in BRAF.





[jira] [Updated] (CASSANDRA-2810) RuntimeException in Pig when using "dump" command on column name

2011-08-09 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2810:


Attachment: 2810-v2.txt

It looks like the final problem here is that IntegerType always returns a 
BigInteger, which pig does not like.  This is unfortunate since IntegerType 
can't be easily subclassed and overridden to return ints.

v2 instead adds a setTupleValue method that is always used for adding values to 
tuples. It houses all the special-casing currently needed and provides a spot for 
more in the future, rather than proliferating custom type converters, since I'm 
sure IntegerType won't be the only offender here.
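
The shape of that change can be sketched as follows. Pig's actual Tuple API is
not used here; a plain List stands in so the single conversion point stays
visible:

```java
import java.math.BigInteger;
import java.util.List;

// Sketch of a single setTupleValue conversion point: every value added to a
// tuple passes through here, so type special-casing lives in one place.
public class TupleValues {
    public static void setTupleValue(List<Object> tuple, Object value) {
        if (value instanceof BigInteger)
            // pig does not handle BigInteger, so narrow it to a long
            // (lossy past 64 bits, which is the known trade-off)
            tuple.add(((BigInteger) value).longValue());
        else
            tuple.add(value); // a spot for more special cases later
    }
}
```

Adding a future special case (another marshal type Pig rejects) means one more
branch here instead of a new custom type converter.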

> RuntimeException in Pig when using "dump" command on column name
> 
>
> Key: CASSANDRA-2810
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2810
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.1
> Environment: Ubuntu 10.10, 32 bits
> java version "1.6.0_24"
> Brisk beta-2 installed from Debian packages
>Reporter: Silvère Lestang
>Assignee: Brandon Williams
> Attachments: 2810-v2.txt, 2810.txt
>
>
> This bug was previously report on [Brisk bug 
> tracker|https://datastax.jira.com/browse/BRISK-232].
> In cassandra-cli:
> {code}
> [default@unknown] create keyspace Test
> with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
> and strategy_options = [{replication_factor:1}];
> [default@unknown] use Test;
> Authenticated to keyspace: Test
> [default@Test] create column family test;
> [default@Test] set test[ascii('row1')][long(1)]=integer(35);
> set test[ascii('row1')][long(2)]=integer(36);
> set test[ascii('row1')][long(3)]=integer(38);
> set test[ascii('row2')][long(1)]=integer(45);
> set test[ascii('row2')][long(2)]=integer(42);
> set test[ascii('row2')][long(3)]=integer(33);
> [default@Test] list test;
> Using default limit of 100
> ---
> RowKey: 726f7731
> => (column=0001, value=35, timestamp=1308744931122000)
> => (column=0002, value=36, timestamp=1308744931124000)
> => (column=0003, value=38, timestamp=1308744931125000)
> ---
> RowKey: 726f7732
> => (column=0001, value=45, timestamp=1308744931127000)
> => (column=0002, value=42, timestamp=1308744931128000)
> => (column=0003, value=33, timestamp=1308744932722000)
> 2 Rows Returned.
> [default@Test] describe keyspace;
> Keyspace: Test:
>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>   Durable Writes: true
> Options: [replication_factor:1]
>   Column Families:
> ColumnFamily: test
>   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>   Default column value validator: 
> org.apache.cassandra.db.marshal.BytesType
>   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
>   Row cache size / save period in seconds: 0.0/0
>   Key cache size / save period in seconds: 20.0/14400
>   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
>   GC grace seconds: 864000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: false
>   Built indexes: []
> {code}
> In Pig command line:
> {code}
> grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS 
> (rowkey:chararray, columns: bag {T: (name:long, value:int)});
> grunt> value_test = foreach test generate rowkey, columns.name, columns.value;
> grunt> dump value_test;
> {code}
> In /var/log/cassandra/system.log, I see this exception several times:
> {code}
> INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 
> TaskInProgress.java (line 551) Error from 
> attempt_201106210955_0051_m_00_3: java.lang.RuntimeException: Unexpected 
> data type -1 found in stream.
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
>   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
>   at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
>   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
>   at 
> org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
>   at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
>   at 
> org.apache.hadoop.mapred.MapTask$

[jira] [Commented] (CASSANDRA-2982) Refactor secondary index api

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082039#comment-13082039
 ] 

Jonathan Ellis commented on CASSANDRA-2982:
---

Want to give a high-level overview of the changes here?

> Refactor secondary index api
> 
>
> Key: CASSANDRA-2982
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2982
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 1.0
>
> Attachments: 2982-v1.txt
>
>
> Secondary indexes currently make some bad assumptions about the underlying 
> indexes.
> 1. That they are always stored in other column families.
> 2. That there is a unique index per column
> In the case of CASSANDRA-2915 neither of these are true.  The new api should 
> abstract the search concepts and allow any search api to plug in.
> Once the code is refactored and basically pluggable we can remove the 
> IndexType enum and use class names similar to how we handle partitioners and 
> comparators.
> Basic api is to add a SecondaryIndexManager that handles different index 
> types per CF and a SecondaryIndex base class that handles a particular type 
> implementation.
> This requires major changes to ColumnFamilyStore and Table.IndexBuilder





[jira] [Commented] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082091#comment-13082091
 ] 

Boris Yen commented on CASSANDRA-3006:
--

Here is the test program I am using now; the hector version is 0.8.0-2.
Hope this is helpful.


import java.util.Arrays;

import me.prettyprint.cassandra.model.AllOneConsistencyLevelPolicy;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.cassandra.service.ThriftCluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HCounterColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;


public class CounterTest {
    private Logger logger = LoggerFactory.getLogger(CounterTest.class);
    private static final Integer COUNTER_NUM = 1000;
    private static final StringSerializer ss = StringSerializer.get();
    private static final String HOST = "172.17.19.151:9160";
    private ThriftCluster cluster;

    public static void main(String[] args) {
        CounterTest tc = new CounterTest();
        try {
            tc.testAlarmCounter();
        } catch (InterruptedException e) {
            // ignored
        }
    }

    public CounterTest() {
        CassandraHostConfigurator chc = new CassandraHostConfigurator(HOST);
        chc.setMaxActive(100);
        chc.setMaxIdle(10);
        chc.setCassandraThriftSocketTimeout(6);

        cluster = new ThriftCluster("Test Cluster", chc);
    }

    public void testAlarmCounter() throws InterruptedException {
        int successCounter = 0;
        int cl = 0;

        // increment the counter COUNTER_NUM times; on any failure, switch
        // the consistency level from the default (Quorum) to One
        for (int i = 0; i < COUNTER_NUM; i++) {
            try {
                Mutator<String> mutator =
                        HFactory.createMutator(getKeyspace(cl), StringSerializer.get());

                HCounterColumn<String> column =
                        HFactory.createCounterColumn("testSC", 1L);
                mutator.addCounter("sc", "testCounter",
                        HFactory.createCounterSuperColumn("testC", Arrays.asList(column), ss, ss));
                mutator.execute();

                successCounter++;
            } catch (Exception e) {
                logger.info("Error! Change consistency level to 1.", e);
                cl = 1;
            }

            Thread.sleep(50);
        }

        logger.info("\nsuccess counter: " + successCounter);
    }

    private Keyspace getKeyspace(int cl) {
        if (cl == 1)
            return HFactory.createKeyspace("test", cluster, new AllOneConsistencyLevelPolicy());
        else
            return HFactory.createKeyspace("test", cluster); // default consistency level is Quorum
    }
}

> Enormous counter 
> -
>
> Key: CASSANDRA-3006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.3
> Environment: ubuntu 10.04
>Reporter: Boris Yen
>Assignee: Sylvain Lebresne
>
> I have a two-node cluster with the following keyspace and column family 
> settings.
> Cluster Information:
>Snitch: org.apache.cassandra.locator.SimpleSnitch
>Partitioner: org.apache.cassandra.dht.RandomPartitioner
>Schema versions: 
>   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
> Keyspace: test:
>   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>   Durable Writes: true
> Options: [datacenter1:2]
>   Column Families:
> ColumnFamily: testCounter (Super)
> "APP status information."
>   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>   Default column value validator: 
> org.apache.cassandra.db.marshal.CounterColumnType
>   Columns sorted by: 
> org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
>   Row cache size / save period in seconds: 0.0/0
>   Key cache size / save period in seconds: 20.0/14400
>   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
>   GC grace seconds: 864000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: true
>   Built indexes: []
> Then, I use a test program based on hector

[jira] [Commented] (CASSANDRA-2991) Add a 'load new sstables' JMX/nodetool command

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082097#comment-13082097
 ] 

Jonathan Ellis commented on CASSANDRA-2991:
---

What about the "restore snapshot" scenario?

> Add a 'load new sstables' JMX/nodetool command
> --
>
> Key: CASSANDRA-2991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2991
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Brandon Williams
>Priority: Minor
> Fix For: 0.8.4
>
>
> Sometimes people have to create a new cluster to get around a problem and 
> need to copy sstables around.  It would be convenient to be able to trigger 
> this from nodetool or JMX instead of doing a restart of the node.





[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction

2011-08-09 Thread Benjamin Coverston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Coverston updated CASSANDRA-1608:
--

Attachment: 1608-v13.txt

1608 without some of the cruft

> Redesigned Compaction
> -
>
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Goffinet
>Assignee: Benjamin Coverston
> Attachments: 1608-v11.txt, 1608-v13.txt, 1608-v2.txt
>
>
> After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
> thinking on this subject that I wanted to lay out.
> I propose we redo the concept of how compaction works in Cassandra. At the 
> moment, compaction is kicked off based on a write access pattern, not read 
> access pattern. In most cases, you want the opposite. You want to be able to 
> track how well each SSTable is performing in the system. If we were to keep 
> statistics in-memory of each SSTable, prioritize them based on most accessed, 
> and bloom filter hit/miss ratios, we could intelligently group sstables that 
> are being read most often and schedule them for compaction. We could also 
> schedule lower priority maintenance on SSTable's not often accessed.
> I also propose we limit each SSTable to a fixed size, which gives 
> us the ability to better utilize our bloom filters in a predictable manner. 
> At the moment after a certain size, the bloom filters become less reliable. 
> This would also allow us to group data most accessed. Currently the size of 
> an SSTable can grow to a point where large portions of the data might not 
> actually be accessed as often.
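
The read-driven prioritization proposed above can be sketched like this; the
stats fields and the weighting formula are hypothetical, purely illustrative of
the "track per-sstable read stats, compact the hottest first" idea:

```java
import java.util.Comparator;
import java.util.List;

// Sketch of ordering sstables for compaction by read hotness rather than
// write pattern; fields are illustrative, not Cassandra's actual metadata.
public class HotnessOrdering {
    static final class SSTableStats {
        final String name;
        final long reads;           // how often this sstable is hit
        final double bloomMissRate; // wasted seeks from bloom false positives
        SSTableStats(String name, long reads, double bloomMissRate) {
            this.name = name;
            this.reads = reads;
            this.bloomMissRate = bloomMissRate;
        }
        double priority() {
            // hot, frequently-missed sstables benefit most from compaction
            return reads * (1.0 + bloomMissRate);
        }
    }

    /** Sort hottest-first; the tail is the low-priority maintenance work. */
    static List<SSTableStats> byPriority(List<SSTableStats> tables) {
        tables.sort(Comparator.comparingDouble(SSTableStats::priority).reversed());
        return tables;
    }
}
```

The same ordering also identifies the rarely-read tail that the proposal would
schedule as lower-priority maintenance.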




