[ 
https://issues.apache.org/jira/browse/CASSANDRA-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959213#comment-16959213
 ] 

Jordan West commented on CASSANDRA-15364:
-----------------------------------------

I agree with [~dcapwell]'s evaluation that this reduces file listing calls 
considerably. Some code comments below. None are serious or required, so +1:

In {{LogFile#deleleFilesForRecordOfType}}:
- Unless its truly exceptional and requires throwing, I prefer the filtering 
approach to getting rid of non-present absolute paths (like the code below). If 
it is something we should throw for, consider adding a description to the 
assert or a more descriptive exception.
{code}
records.stream()
       .filter(r -> type.matches(r) && r.absolutePath.isPresent())
       .forEach(r -> absolutePaths.add(r.absolutePath.get()));
{code}
- Regarding the second for loop in this method, could it instead iterate over 
absolutePaths, which is already filtered (regardless of the suggestion above) 
for paths matched by type and having the path present in the record? This would 
save a few calls/branches, and reduce the need for the repeated assert.

In LogFile#verify, these are truly minor nits and personal style, it could be 
argued they are less efficient but I prefer the more java 8 like approach to:
  - #L160-1: {{records.forEach(r -> 
r.absolutePath.ifPresent(absolutePaths::add))}}
  - #L164-173: 
{code}
for (LogRecord record : records)
{
    List<File> existingFiles = record.absolutePath.flatMap(k -> 
Optional.ofNullable(recordFiles.get(k)))
                                                  
.orElse(Collections.emptyList());
    LogFile.verifyRecord(record, existingFiles);
}
{code}

> Avoid over scanning data directories in LogFile.verify()
> --------------------------------------------------------
>
>                 Key: CASSANDRA-15364
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15364
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We currently list the data directory for every {{REMOVE}} record in the file 
> in {{LogFile.verify()}} - this can get very expensive during startup when we 
> call {{LogTransaction.removeUnfinishedLeftovers()}}. In 
> {{LogRecord.getExistingFiles(Set<String> absoluteFilePaths)}} we also fully 
> parse the file name of the sstables found, here we only need to prefix match.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to