[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945632#comment-15945632 ]
Ishan Chattopadhyaya commented on LUCENE-7745: ---------------------------------------------- Hi Vikash, Regarding licensing issue: The work done in this project would be exploratory. That code won't necessarily go into Lucene. When we are at a point where we see clear benefits from the work done here, we would then have to explore all aspects of productionizing it (including licensing). Regarding next steps: {quote} BooleanScorer calls a lot of classes, e.g. the BM25 similarity or TF-IDF to do the calculation that could possibly be parallelized. {quote} # First, understand how BooleanScorer calls these similarity classes and does the scoring. There are unit tests in Lucene that can help you get there. This might help: https://wiki.apache.org/lucene-java/HowToContribute # Write a standalone CUDA/OpenCL project that does the same processing on the GPU. # Benchmark the speed of doing so on GPU vs. speed observed when doing the same through the BooleanScorer. Preferably, on a large resultset. Include time for copying results and scores in and out of the device memory from/to the main memory. # Optimize step 2, if possible. Once this is achieved (which in itself could be a sufficient GSoC project), one can have stretch goals to try out other parts of Lucene to optimize (e.g. spatial search). > Explore GPU acceleration > ------------------------ > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org