[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-23 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375821309
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-23 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375770793
 
 
   > If we were using stock ConcurrentHashMap it'd be less risky, but since the 
hashmap is our code too, it's better to err on the safe side.
   
   `ConcurrentLongHashMap` is already everywhere in the code base :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-23 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375752857
 
 
   @ivankelly 
   
   > it seems pointless to have locks around a concurrent hashmap
   
   well. depends on how you see stuffs. concurrent hashmap is section based 
implementation, it has better parallelism and less contention on accessing 
stuffs. especially most of the time the map is protected under a read lock and 
only be blocked when the map is going to modified.
   
   > I've made a PR into your branch with a lockless approach.
   
   I have thought about lockless approach but I didn't do it. A lockless 
approach moves the mutation out of the protected of a write lock, which can 
potentially make it exposed to race conditions that we don't think of. This 
area has been very tricky based on the PRs and discussions we had around it. It 
is a too-risky change for me. That's also the reason I explained above and the 
reason I don't feel comfortable about your lockless change.
   
   My immediate concern is only "don't do IO under one lock". I am much 
comfortable to keep the read/write lock until there is an evidence showing 
read/write lock is a problem.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-23 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375587424
 
 
   > Ah yes. At minimum we should be checking again for existence inside the 
write lock.
   
   this is one of the change I made here, checking the existence.
   
   > But again, my question is whether this has been observed to be a problem?
   
   no at benchmark now. I hit the race condition then I noticed this behavior 
that file lookup is under a giant lock, which I would like to remove. because 
we have been working hard at SortedLedgerStorage to avoid doing any IO under 
lock. This FileInfoBackingCache unfortunately added this behavior back. So I 
have a strong concern about this and want to move the I/O operation out. 
However I don't want to change the locking behavior on getting FileInfo, 
reasons explained as below.
   
   > if we want to get rid of the locking, we should get rid of the big lock 
completely, which shouldn't be too hard with the concurrent map.
   
   a lot of race conditions can come around when competing on getting the 
fileinfo. so check-existence and put is much safer to happen under the write 
lock. and they are just memory operation, which is fine to happen under the 
write lock.
   
   However find the directory for a ledger is an I/O operation, it is 
expensive, unpredictable than memory operation. so it should not happen under 
the lock.
   
   so the current approach is the most comfortable and safest approach I can 
think of. However if you really feel strong about this, feel free to start your 
approach.
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-23 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375587424
 
 
   > Ah yes. At minimum we should be checking again for existence inside the 
write lock.
   
   this is one of the change I made here, checking the existence.
   
   > But again, my question is whether this has been observed to be a problem?
   
   no at benchmark now. I hit the race condition then I noticed this behavior 
that file lookup is under a giant lock, which I would like to remove. because 
we have been working hard at SortedLedgerStorage to avoid doing any IO under 
lock. This FileInfoBackingCache unfortunately added this behavior back. So I 
have a strong concern about this and want to move the I/O operation out. 
However I don't want to change the locking behavior on getting FileInfo, 
reasons explained as below.
   
   > if we want to get rid of the locking, we should get rid of the big lock 
completely, which shouldn't be too hard with the concurrent map.
   
   a lot of race conditions can come around when competing on getting the 
fileinfo. so check-existence and put is much safer to happen under the write 
lock. and they are just memory operation, which is fine to happen under the 
write lock.
   
   However check if the file exists or not is an I/O operation, it is 
expensive, unpredictable than memory operation. so it should not happen under 
the lock.
   
   so the current approach is the most comfortable and safest approach I can 
think of. However if you really feel strong about this, feel free to start your 
approach.
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-22 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375420639
 
 
   > Could you give a sequence of events for how the other race occurs?
   
   I am using your diagram and using the existing logic
   
   ```
 | Read Fi CacheLoader   | Write Fi CacheLoader 
   |
 | == | 
== |
 | calls loadFileInfo | 
   |
 | - checks fileInfos under read lock | 
   |
 || 
   |
 || calls loadFileInfo  
   |
 || - checks fileInfos under read 
lock |
 || - create a file info and put 
into fileInfos under write lock |
 | - create another file  info and put it  |
|
 | into fileInfos under write lock| 
   |
   ```
   
   The Read Fi CacheLoader will overwrite the fileinfo created by Write Fi 
CacheLoader.
   
   
   > Has the locking shown to be a problem in benching?
   
   it is a giant write lock. so every thread is blocking waiting for file 
lookup (which is done by the file loader).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-22 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375420639
 
 
   > Could you give a sequence of events for how the other race occurs?
   
   I am using your diagram and using the existing logic
   
   ```
 | Read Fi CacheLoader   | Write Fi CacheLoader 
   |
 | == | 
== |
 | calls loadFileInfo | 
   |
 | - checks fileInfos under read lock | 
   |
 || 
   |
 || calls loadFileInfo  
   |
 || - checks fileInfos under read 
lock |
 || - create a file info and put 
into fileInfos under write lock |
 | - create another file  info and put it  |
|
 | into fileInfos under write lock| 
   |
   ```
   
   
   > Has the locking shown to be a problem in benching?
   
   it is a giant write lock. so every thread is blocking waiting for file 
lookup (which is done by the file loader).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-22 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375420639
 
 
   > Could you give a sequence of events for how the other race occurs?
   
   I am using your diagram and using the existing logic
   
   ```
 | Read Fi CacheLoader   | Write Fi CacheLoader 
   |
 | == | 
== |
 | calls loadFileInfo | 
   |
 | - checks fileInfos under read lock | 
   |
 || 
   |
 || calls loadFileInfo  
   |
 || - checks fileInfos under read 
lock |
 || - create a file info and put 
into fileInfos under write lock |
 | - create another file ||
 |info and put it into||
 |  fileInfos under write lock ||
   ```
   
   
   > Has the locking shown to be a problem in benching?
   
   it is a giant write lock. so every thread is blocking waiting for file 
lookup (which is done by the file loader).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-22 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375420639
 
 
   > Could you give a sequence of events for how the other race occurs?
   
   I am using your diagram and using the existing logic
   
   ```
 | Read Fi CacheLoader   | Write Fi CacheLoader 
   |
 | == | 
== |
 | calls loadFileInfo | 
   |
 | - checks fileInfos under read lock | 
   |
 |||
 || calls loadFileInfo  
   |
 || - checks fileInfos under read 
lock |
 || - create a file info and put 
into fileInfos under write lock |
 | - create another file info and put it into fileInfos under write lock |  
  |
   ```
   
   
   > Has the locking shown to be a problem in benching?
   
   it is a giant write lock. so every thread is blocking waiting for file 
lookup (which is done by the file loader).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #1284: Improve FileInfoBackingCache

2018-03-22 Thread GitBox
sijie commented on issue #1284: Improve FileInfoBackingCache
URL: https://github.com/apache/bookkeeper/pull/1284#issuecomment-375420639
 
 
   > Could you give a sequence of events for how the other race occurs?
   
   I am using your diagram and using the existing logic
   
   ```
 | Read Fi CacheLoader   | Write Fi CacheLoader 
   |
 | == | 
== |
 | calls loadFileInfo | 
   |
 | - checks fileInfos under read lock | 
   |
 |||
 || calls loadFileInfo  
   |
 || - checks fileInfos under read 
lock |
 || - create a file info and put 
into fileInfos under write lock |
 | - create another file | |
 |info and put it into fileInfos under write lock | 
   |
   ```
   
   
   > Has the locking shown to be a problem in benching?
   
   it is a giant write lock. so every thread is blocking waiting for file 
lookup (which is done by the file loader).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services