[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877934#comment-16877934 ]

Rinka Singh commented on LUCENE-7745:

{quote}The basic idea is to compute sub-histograms in each thread block with each thread block accumulating into the local memory. Then, when each thread block finishes its workload, it atomically adds the result to global memory, reducing the overall amount of traffic to global memory.

To increase throughput and reduce shared memory contention, the main contribution here is that they actually use R "replicated" sub-histograms in each thread block, and they offset them so that bin 0 of the 1st histogram falls into a different memory bank than bin 0 of the 2nd histogram, and so on for R histograms. Essentially, it improves throughput in the degenerate case where multiple threads are trying to accumulate the same histogram bin at the same time.
{quote}

So here's what I've done/am doing:

I have basic histogramming (including eliminating stop words) working on a single GPU (I have an old Quadro 2000 with 1 GB memory). I've tested it on a 5 MB text file and it seems to be working OK.

Briefly, this is how I'm implementing it:
* Read a file named on the command line (Linux executable) into the GPU
* convert the stream to words, chunk them into blocks
* eliminate the stop words
* sort/merge (including word-count) everything, first inside a block and then across blocks - I came up with my own sort - haven't had the time to explore the parallel sorts out there
* This results in a sorted histogram that is held in multiple blocks in the GPU.

The advantages of this approach (to my mind) are:
* I can scale up to use the entire GPU memory. My guess is I can create and manage an 8-10 GB index on a V100 (it has 32 GB) - like I said, I've only tested with a 5 MB text file so far.
* Easy to add fresh data into the existing histogram. All I need to do is create new blocks and sort/merge them all.
* I'm guessing this should make it easy to implement scaling across GPUs, which means that on a multi-GPU machine I can scale to almost the number of GPUs there, and then of course one can set up a cluster of such machines... This is far in the future though...
* The sort is kept separate so we can experiment with various sorts and see which one performs best.

The issues are:
* It is currently horrendously slow (I use global memory all the way and no optimization). Well OK, much too slow for my liking (I went over to nVidia's office and tested it on a K80 and it was just twice as fast as my GPU). I'm currently trying to implement a shared memory version (and a few other tweaks) that should speed it up.
* I have yet to do comparisons with the histogramming tools out there and so cannot say how much better it is. Once I have the basic inverted index in place, I'll reach out to you all for the testing.
* It is still a bit fragile - I'm still finding bugs as I test, but the basics work.

Currently in process:
* The code is being modified for (some) performance. Am debugging/testing - it will take a while. As of now, I feel good about what I've done but I won't know till I test for performance.
* Need to add the ability to handle multiple files (I think I will postpone this as one can always cat the files together and pass the result in - that is a pretty simple script that can be wrapped around the executable).
* Need to create the inverted index.
* We'll worry about searching on the index later but that should be pretty trivial - well, actually nothing is trivial here.

{quote}Re: efficient histogram implementation in CUDA

If it helps, [this approach|https://scholar.google.com/scholar?cluster=4154868272073145366&hl=en&as_sdt=0,3] has been good for a balance between GPU performance and ease of implementation for work I've done in the past. If academic paywalls block you for all those results, it looks to also be available (presumably by the authors) on [researchgate|https://www.researchgate.net/publication/256674650_An_optimized_approach_to_histogram_computation_on_GPU]
{quote}

Took a quick look - they are all priced products. I will take a look at researchgate sometime.

I apologize but I may not be very responsive in the next month or so, as we are in the middle of a release at work, on top of my night-time job (this).
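The block-wise sort/merge pipeline described in this comment can be sketched on the CPU as follows. This is only an illustration in Python, not the author's CUDA code: the stop-word list, the block size, and all function names are assumptions made for the example, and the per-block step uses an ordinary sort rather than the custom sort mentioned above.

```python
import re
from collections import Counter
from heapq import merge
from itertools import groupby

STOP_WORDS = {"the", "a", "of", "and"}  # hypothetical stop-word list
BLOCK_SIZE = 4                           # words per block; tiny on purpose


def tokenize(text):
    """Convert the input stream to lowercase words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def chunk(words, size):
    """Split the word stream into fixed-size blocks."""
    return [words[i:i + size] for i in range(0, len(words), size)]


def sort_block(block):
    """Drop stop words, count, and return a sorted (word, count) list for one block."""
    counts = Counter(w for w in block if w not in STOP_WORDS)
    return sorted(counts.items())


def merge_blocks(sorted_blocks):
    """Merge already-sorted per-block lists into one sorted histogram."""
    merged = merge(*sorted_blocks)  # k-way merge; equal words end up adjacent
    return [(word, sum(count for _, count in group))
            for word, group in groupby(merged, key=lambda wc: wc[0])]


def histogram(text, block_size=BLOCK_SIZE):
    """Tokenize, chunk, sort each block, then merge across blocks."""
    blocks = chunk(tokenize(text), block_size)
    return merge_blocks([sort_block(b) for b in blocks])
```

Adding fresh data then amounts to sorting the new block and merging it with the existing histogram, for example `merge_blocks([histogram(old_text), sort_block(new_words)])`, which mirrors the "create new blocks and sort/merge them all" point in the list of advantages.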
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877165#comment-16877165 ]

Joshua Mack commented on LUCENE-7745:

Sounds good!

Re: efficient histogram implementation in CUDA

If it helps, [this approach|https://scholar.google.com/scholar?cluster=4154868272073145366&hl=en&as_sdt=0,3] has been good for a balance between GPU performance and ease of implementation for work I've done in the past. If academic paywalls block you for all those results, it looks to also be available (presumably by the authors) on [researchgate|https://www.researchgate.net/publication/256674650_An_optimized_approach_to_histogram_computation_on_GPU]

The basic idea is to compute sub-histograms in each thread block with each thread block accumulating into the local memory. Then, when each thread block finishes its workload, it atomically adds the result to global memory, reducing the overall amount of traffic to global memory.

To increase throughput and reduce shared memory contention, the main contribution here is that they actually use R "replicated" sub-histograms in each thread block, and they offset them so that bin 0 of the 1st histogram falls into a different memory bank than bin 0 of the 2nd histogram, and so on for R histograms. Essentially, it improves throughput in the degenerate case where multiple threads are trying to accumulate the same histogram bin at the same time.

> Explore GPU acceleration
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ishan Chattopadhyaya
> Assignee: Ishan Chattopadhyaya
> Priority: Major
> Labels: gsoc2017, mentor
> Attachments: TermDisjunctionQuery.java, gpu-benchmarks.png
>
> There are parts of Lucene that can potentially be speeded up if computations
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known
> to be a good candidate for GPU based speedup (esp. when complex polygons are
> involved). In the past, Mike McCandless has mentioned that "both initial
> indexing and merging are CPU/IO intensive, but they are very amenable to
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I
> volunteer to mentor any GSoC student willing to work on this this summer.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
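To make the replicated sub-histogram idea in the comment above concrete, here is a small CPU-side simulation in Python. It is a sketch under stated assumptions, not the paper's implementation and not CUDA: the bank count of 32, the padding rule in `replica_stride`, the "thread t uses replica t mod R" assignment, and every name below are choices made for illustration only. Plain additions stand in for the atomic adds a real kernel would need.

```python
NUM_BANKS = 32  # assumed number of shared-memory banks, for illustration


def replica_stride(num_bins, num_banks=NUM_BANKS):
    """Smallest stride >= num_bins with stride % num_banks == 1.

    With this stride, bin 0 of replica r lands in bank r % num_banks,
    so consecutive replicas start in different banks.
    """
    stride = num_bins
    while stride % num_banks != 1:
        stride += 1
    return stride


def block_histogram(values, num_bins, replicas):
    """Simulate one thread block accumulating into R replicated sub-histograms."""
    stride = replica_stride(num_bins)
    shared = [0] * (stride * replicas)       # stands in for block-local memory
    for thread_id, bin_index in enumerate(values):
        r = thread_id % replicas             # spread threads across replicas
        shared[r * stride + bin_index] += 1  # would be an atomic add on a GPU
    # Reduce the R replicas into a single per-block histogram.
    return [sum(shared[r * stride + b] for r in range(replicas))
            for b in range(num_bins)]


def histogram(values, num_bins, block_size=4, replicas=2):
    """Process the input block by block; add each block's result to the global histogram."""
    global_hist = [0] * num_bins
    for start in range(0, len(values), block_size):
        local = block_histogram(values[start:start + block_size], num_bins, replicas)
        for b, count in enumerate(local):
            global_hist[b] += count          # one global update per bin per block
    return global_hist
```

The point of the layout is visible in the degenerate case the comment mentions: when many threads hit the same bin, they are spread over R different counters in different banks instead of serializing on one.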
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863938#comment-16863938 ]

Rinka Singh commented on LUCENE-7745:

Hi [~mackncheesiest], All,

A quick update. I've been going really slow (sorry 'bout that). My day job has consumed a lot of my time. Also, what I have working is histogramming (on text files) on GPUs - the problem with that is that it is horrendously slow - I use the GPU global memory all the way (it is just about 4-5 times faster than a CPU) instead of sorting in local memory. I've been trying to accelerate that before converting it into an inverted index. Nearly there :) you know how that is - the almost-there syndrome... Once I get it done, I'll check it into my github.

Here are the lessons I've learned on my journey:
# Do all the decision making in the CPU. See if parallelization can substitute for decision making - you need to think about parallelization/optimization as part of the design, not as an afterthought. This is counter to what everyone says about optimization. The reason is that there could be SIGNIFICANT design changes.
# Do as fine-grained parallelism as possible. Don't think one CPU thread == one GPU thread. Think as parallel as possible.
# The best metaphor I found (for working with GPUs): think of the GPU as an embedded board attached to your machine. You move data to and from the board and debug on the board. Dump all parallel processing on the board and keep the sequential work on the CPU.
# Read the nVidia manuals (they are your best bet). I figured it is better to stay with CUDA (as against OpenCL) given the wealth of CUDA info and support out there...
# Writing code:
## Explicitly think about cache memory (that's your shared memory) and registers (local memory) and manage them. This is COMPLETELY different from writing CPU code, where the compiler does this for you.
## Try to use const, shared and register memory as much as possible. Avoid __syncthreads() if you can.
## Here's where the dragons lie...
# Engineer productivity is roughly 1/10th the normal productivity (and I mean writing C/C++, not Python). I've written and thrown away code umpteen times - something that I just wouldn't need to do when writing standard code.

Having said all this, :) I have a bunch of limitations that a regular software engineer will not have, and I have been struggling to get over them. I've been a manager for way too long and find it really difficult to focus on just one thing (the standard ADHD that most managers eventually develop). Also, I WAS a C programmer, loong ago - no, not even C++ - and I just haven't had the bandwidth to pick C++ up. And then let's not even talk about my day job pressures - I do this for an hour or two at night (sigh)...

I will put out everything I've done once I've crossed a milestone (a working accelerated histogram). Then I will modify that to do inverted indexing.

Hope this helps... In the meantime, if you want me to review documents/design/thoughts/anything, please feel free to mail them to me at: rinka (dot) singh (at) gmail. At least ping me - I really don't look at the Apache messages and would probably miss something...
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794786#comment-16794786 ]

Joshua Mack commented on LUCENE-7745:

Hello all,

There's a chance that I may end up being a lucky graduate student tasked with handling something similar to this a few months down the line, with application to a largely textual Kafka+Logstash+Elastic pipeline.

[~ichattopadhyaya], I've noticed that the Github links above/referenced in the Activate 2018 talk now lead to 404s. I understand that the code is not intended to be taken as the de facto method by which functionality like this should be added to Lucene, but having a baseline to look into for some of our investigations would certainly be helpful and could save us (myself) some time. If you'd be willing to rehost it somewhere or provide updated results, I would certainly appreciate it.

Similarly, [~rinka], coming from a naive outsider, I think your plan sounds good, and if you have performed any preliminary investigations, I would appreciate being able to learn from your lessons thus far.

From my perspective, I have experience with GPUs and Solr separately, but I've never dealt with them together (or with Lucene directly, for that matter).

Finally, we're still in a very investigative phase and I'm certainly not the PI, but assuming this ends up being relevant to our project, I'm sure we would be happy to contribute back whatever we end up doing as a part of the project.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722717#comment-16722717 ]

Rinka Singh commented on LUCENE-7745:

Thank you. As a first step let me come up with a GPU-based index builder, and we can look at query handling on the GPU as a second step.

:-) I AM going to be slooow - my apologies, but I'm interested enough to put effort into this and will do my best.

Here's what I'll do as a first step: develop a stand-alone executable (we can figure out modifications to use it directly in Lucene as step 1.1) that will:
a. Read multiple files (command line) and come up with an inverted index
b. Write the inverted index to stdout (I can generate a Lucene index as step 1.1)
c. Handle a stop-words file as a command line param
d. Work on one GPU + 1 thread of the CPU (I'll keep multi-GPU and multi-threading to use all CPUs in mind, but implementing that will be a separate step altogether).

Goal: look at the speed difference between an index generated on the CPU vs. the GPU for just this. We can build from there...

Thoughts please...
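As a point of reference for the first step proposed above, a CPU-only baseline for items a. to c. might look like the Python sketch below. It is an assumption-laden illustration rather than the planned executable: the command-line layout (stop-words file first, then input files), the tab-separated output format, and all function names are hypothetical, and the postings store only document names, not positions or frequencies.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_inverted_index(docs, stop_words=frozenset()):
    """Map each term to the sorted list of document names that contain it.

    docs is a mapping of document name -> document text.
    """
    postings = defaultdict(set)
    for name, text in docs.items():
        for term in tokenize(text):
            if term not in stop_words:
                postings[term].add(name)
    return {term: sorted(names) for term, names in sorted(postings.items())}


def main(argv):
    """Hypothetical CLI: prog STOP_WORDS_FILE FILE [FILE ...]; index goes to stdout."""
    stop_path, *paths = argv[1:]
    with open(stop_path, encoding="utf-8") as handle:
        stop_words = set(tokenize(handle.read()))
    docs = {}
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            docs[path] = handle.read()
    for term, names in build_inverted_index(docs, stop_words).items():
        print(term + "\t" + " ".join(names))
```

Calling `main(sys.argv)` from a script entry point would give a single-threaded CPU timing to compare against a GPU build of the same index, which is the comparison named in the goal.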
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16703835#comment-16703835 ]

Adrien Grand commented on LUCENE-7745:

bq. How critical is the inverted index to the user experience?

Completely: almost all queries run on the inverted index. Unlike other datastores that run queries via linear scans and let you speed things up by building indices, Lucene only enables querying via an index.

bq. What happens if the inverted index is speeded up?

Then most queries get a speed-up too.

bq. How many AWS instances would usually be used for searching through ~140GB sized inverted index

Hard to tell, it depends on your search load, how expensive queries are, etc.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16702135#comment-16702135 ]

Rinka Singh commented on LUCENE-7745:

A few questions. How critical is the inverted index to the user experience? What happens if the inverted index is speeded up?
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701990#comment-16701990 ]

Rinka Singh commented on LUCENE-7745:

[~jpountz]
{quote}(Unrelated to your comment Rinka, but seeing activity on this issue reminded me that I wanted to share something) There are limited use-cases for GPU acceleration in Lucene due to the fact that query processing is full of branches, especially since we added support for impacts and WAND.{quote}

While yes, branches do impact performance, well-designed (GPU) code will consist of a combination of CPU code (the decision-making part) and GPU code. For example, I wrote a histogram as a test case that saw SIGNIFICANT acceleration, and I also identified further code areas that can be improved. I'm fairly sure (gut feel) I can squeeze out a 40-50x kind of improvement at the very least on a mid-sized GPU (given the time etc.). I think things will be much, much better on a high-end GPU, with further scale-up on a multi-GPU system...

Incidentally, this is why I want to develop a library that I can put out there for integration.

{quote}That said Mike initially mentioned that BooleanScorer might be one scorer that could benefit from GPU acceleration as it scores large blocks of documents at once. I just attached a specialization of a disjunction over term queries that should make it easy to experiment with Cuda, see the TODO in the end on top of the computeScores method.
{quote}

Lucene is really new to me (and so is working with Apache - sorry, I am a newbie to Apache) :). Could you please put the links here...
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701976#comment-16701976 ]

Rinka Singh commented on LUCENE-7745:

> The code is not worth a patch right now, but will soon have something. I
> shall update on the latest state
> here as soon as I find myself some time (winding down from a hectic Black
> Friday/Cyber Monday support schedule).

Do you think I could take a look at the code? I could do a quick review and perhaps add a bit of value. I'm fine if the code is in a dev state.

Have you written up something to describe what you are doing?
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701951#comment-16701951 ]

Ishan Chattopadhyaya commented on LUCENE-7745:

bq. Your thoughts please...

Thanks for your interest in this. It seems that your proposed ideas are very much in line with the approach that we're trying out as well. We are running some initial experiments as we speak, and I can see that there are benefits in the niche use cases that Adrien mentioned.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701941#comment-16701941 ]

Ishan Chattopadhyaya commented on LUCENE-7745:

Hi Rinka,

Kishore and I made some progress on this and presented the current state of this initiative here: https://www.youtube.com/watch?v=cY_4ApOAVJQ

The code is not worth a patch right now, but we will soon have something. I shall update on the latest state here as soon as I find myself some time (winding down from a hectic Black Friday/Cyber Monday support schedule).
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701879#comment-16701879 ] Adrien Grand commented on LUCENE-7745: -- (Unrelated to your comment Rinka, but seeing activity on this issue reminded me that I wanted to share something) There are limited use-cases for GPU accelelation in Lucene due to the fact that query processing is full of branches, especially since we added support for impacts and WAND. That said Mike initially mentioned that BooleanScorer might be one scorer that could benefit from GPU acceleration as it scores large blocks of documents at once. I just attached a specialization of a disjunction over term queries that should make it easy to experiment with Cuda, see the TODO in the end on top of the {{computeScores}} method. > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: gsoc2017, mentor > Attachments: TermDisjunctionQuery.java, gpu-benchmarks.png > > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
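Adrien's point above, that BooleanScorer scores large blocks of documents at once, can be illustrated with a minimal CPU sketch of window-at-a-time disjunction scoring. This is not the attached TermDisjunctionQuery.java and not Lucene's BooleanScorer: the class name, the window size, the array layout and the precomputed per-posting scores are all assumptions made for illustration. Only the method name computeScores is borrowed from the comment.

```java
import java.util.Arrays;

/** Hypothetical sketch of block-wise disjunction scoring; not Lucene code. */
public class WindowedDisjunction {
    /** Window size chosen arbitrarily for this sketch. */
    static final int WINDOW = 2048;

    /**
     * Sums the per-term scores of every document in [windowBase, windowBase + WINDOW).
     * docIds[t] is the sorted postings list of term t, and termScores[t][i] is the
     * (assumed precomputed) score contribution of term t for document docIds[t][i].
     */
    public static float[] computeScores(int[][] docIds, float[][] termScores, int windowBase) {
        float[] scores = new float[WINDOW];
        int windowEnd = windowBase + WINDOW;
        for (int t = 0; t < docIds.length; t++) {
            // Locate the first posting of term t that falls inside the window.
            int i = Arrays.binarySearch(docIds[t], windowBase);
            if (i < 0) {
                i = -i - 1; // not found: convert to the insertion point
            }
            // Branch-light accumulation loop: this is the part a GPU kernel would take over.
            for (; i < docIds[t].length && docIds[t][i] < windowEnd; i++) {
                scores[docIds[t][i] - windowBase] += termScores[t][i];
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        int[][] docIds = { {1, 5, 3000}, {5, 7} };
        float[][] termScores = { {1.0f, 2.0f, 4.0f}, {0.5f, 0.25f} };
        float[] s = computeScores(docIds, termScores, 0);
        System.out.println(s[1] + " " + s[5] + " " + s[7]); // prints "1.0 2.5 0.25"
    }
}
```

The inner loop has no data-dependent branching beyond the bounds check, which is why this style of scoring is a more plausible GPU candidate than the branch-heavy WAND and impacts code paths.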
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701855#comment-16701855 ] Rinka Singh commented on LUCENE-7745: - Hi everyone, I wanted to check if this issue was still open. I have been experimenting with CUDA for a bit and would love to take a stab at this. A few thoughts: * This is something I'll do over weekends, so I'm going to be horribly slow (it's going to be just me on this unless you have someone working on it and I can collaborate with them) - would that be OK? * I think the right thing to do would be to build a CUDA library (C/C++), wrap it with JNI and then integrate it into Lucene. If done right, I think this library will be useful to (and possible to integrate with) other analytics tools. * If I get it right, then I'd love to create an open-source (OS) library that other OS tools can integrate and use (Yes, I'm thinking of an OpenCL port in the future but given the tools available in CUDA and my familiarity with it...) * Licensing is not an issue as I prefer the Apache License. * Testing (especially scalability testing) will be an issue - like you said, your setups won't have GPUs, but would it be possible to rent a few GPU instances on the cloud (AWS, Google)? I can do my dev testing locally as I have a GPU (it's a pretty old and obsolete one but good enough for my needs) on my dev machine. * It is important to get a few users who will experiment with this. Can you guys help in having someone deploy, experiment and give feedback? * I would rather take something that is used by everyone, and I'm thinking that indexing, filtering and searching is what I would rather take up: [http://lucene.apache.org/core/7_5_0/demo/overview-summary.html#overview.description] ** These can certainly be accelerated. I think I should be able to get some acceleration out of a GPU enabled search.
** The good part of this is that one would be able to scale volumes almost linearly on a multi-GPU machine. ** Related to the previous point (though this is in the future): I don't have a multi-GPU setup and will not be able to develop multi-GPU versions. I'll need help in getting the infrastructure to do that. We can talk about that once a single GPU version is done. ** Yes, I agree that it will be better to have a separate library / classes doing this rather than directly integrating it into Lucene's class library. This suits me too, as I can develop this as a separate library that other OS components can integrate, and I can package this as part of nVidia's OS libraries. * I'm open to other alternatives - I scanned the ideas above but didn't consider them as they would not bring massive value to the users and I don't really want to experiment as I know what I'm doing. * Related to the previous point, I don't know Lucene (Help!! - do I really need to?) and will need support/hand-holding in terms of reviewing the identification/interfacing/design/code etc. * Finally, this IS GOING TO take time because thinking (and programming) massively parallel is completely different from writing a simple sequential search and sort. How much time? Think 7-10x at least, given all my constraints. If you guys like, I can write a brief (one or two paras) description of what is possible for indexing, searching, filtering (with zero knowledge of Lucene of course) to start off... Your thoughts please... > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: gsoc2017, mentor > Attachments: gpu-benchmarks.png > > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). 
With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
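The "separate CUDA library behind JNI" layout that Rinka proposes above could look roughly like the following on the Java side. This is only a sketch under stated assumptions: the library name "gpuscore", the native method scaleOnGpu and the trivial "multiply by a boost" operation are all hypothetical placeholders, and no such library exists in Lucene. The sketch falls back to plain Java when the native library cannot be loaded, and it passes data through direct buffers, which native code can access without an extra copy.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical Java facade over a native (e.g. CUDA) library; names are assumptions. */
public class GpuScoringFacade {
    /** True only if the hypothetical native library "gpuscore" could be loaded. */
    private static final boolean NATIVE_AVAILABLE = tryLoad("gpuscore");

    private static boolean tryLoad(String lib) {
        try {
            System.loadLibrary(lib);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // no native library on this machine: use the Java fallback
        }
    }

    /** Hypothetical native entry point; it would be implemented in C/C++ on top of CUDA. */
    private static native void scaleOnGpu(FloatBuffer in, FloatBuffer out, int n, float boost);

    /** Multiplies every value by boost, on the GPU if available, otherwise on the CPU. */
    public static float[] scale(float[] values, float boost) {
        int n = values.length;
        FloatBuffer in = ByteBuffer.allocateDirect(n * Float.BYTES)
                .order(ByteOrder.nativeOrder()).asFloatBuffer();
        FloatBuffer out = ByteBuffer.allocateDirect(n * Float.BYTES)
                .order(ByteOrder.nativeOrder()).asFloatBuffer();
        in.put(values);
        if (NATIVE_AVAILABLE) {
            scaleOnGpu(in, out, n, boost);
        } else {
            for (int i = 0; i < n; i++) {
                out.put(i, in.get(i) * boost); // absolute get/put, independent of buffer position
            }
        }
        float[] result = new float[n];
        for (int i = 0; i < n; i++) {
            result[i] = out.get(i);
        }
        return result;
    }
}
```

Keeping the native call behind one small facade class is what makes it possible to ship this as a separate library, as discussed in the comment, without touching Lucene's own classes.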
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527334#comment-16527334 ] Ishan Chattopadhyaya commented on LUCENE-7745: -- Ah, I think I wasn't clear on my intentions behind those numbers. bq. if it brings any performance - I doubt that, because the call overhead between Java and CUDA is way too high - in contrast to Postgres where all in plain C/C++ I wanted to start with those experiments just to prove to myself that there are no significant overheads or bottlenecks (as we've feared in the past) and that there can be clear benefits to be realized. I wanted to try bulk scoring, and chose the distance calculation and sorting as an example because (1) it leverages two fields, (2) it was fairly isolated & easy to try out. In practical use cases of spatial search, the spatial filtering doesn't require score calculation & sorting on the entire dataset (just those documents that are in the vicinity of the user point, filtered down by the geohash or bkd tree node); so in some sense I was trying out an absolute worst case of Lucene spatial search. Now that I'm convinced that this overall approach works and overheads are low, I can move on to looking at Lucene internals, maybe starting with scoring in general (BooleanScorer, for example). Other parts of Lucene/Solr that might see benefit could be streaming expressions (since they seem computation heavy), LTR re-ranking etc. Actually incorporating all these benefits into Lucene would require considerable effort, and we can open subsequent JIRAs once we've had a chance to explore them separately. Till then, I'm inclined to keep this issue as a kitchen sink for all-things-GPU, if that makes sense? 
> Explore GPU acceleration for spatial search > --- > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial-extras >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: gsoc2017, mentor > Attachments: gpu-benchmarks.png > > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526510#comment-16526510 ] David Smiley commented on LUCENE-7745: -- np. Oh, this caught me by surprise too! I thought this was about BooleanScorer or postings or something, and then lo and behold it's spatial -- and then I thought this is so non-obvious from the issue title. So I thought I'd do a little JIRA gardening. > Explore GPU acceleration for spatial search > --- > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial-extras >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: gsoc2017, mentor > Attachments: gpu-benchmarks.png > > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526506#comment-16526506 ] Adrien Grand commented on LUCENE-7745: -- Not sure why I confused names, I meant Ishan indeed. Sorry for that. I'll let Ishan decide how he wants to manage this issue, I'm personally fine either way, I'm mostly following. :) It just caught me by surprise since I was under the impression that we were still exploring which areas might benefit from GPU acceleration. > Explore GPU acceleration for spatial search > --- > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial-extras >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: gsoc2017, mentor > Attachments: gpu-benchmarks.png > > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526486#comment-16526486 ] David Smiley commented on LUCENE-7745: -- Mark who? You must mean Ishan? I think that if GPUs are used to accelerate different things, then they would get separate issues and not be lumped under one issue. Does that sound reasonable? Granted, the problem posted started off as a bit of an umbrella ticket, and perhaps the particular proposal Ishan is presenting in his most recent comment ought to go in a new issue specific to spatial. Accelerating Haversine calculations sounds way different to me than BooleanScorer stuff; no? > Explore GPU acceleration for spatial search > --- > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial-extras >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: gsoc2017, mentor > Attachments: gpu-benchmarks.png > > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. 
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526467#comment-16526467 ] Adrien Grand commented on LUCENE-7745: -- David, I'm not sure this was meant to be specific to lucene/spatial, Mark only mentioned it as a way to conduct an initial benchmark? The main thing that we identified as being a potential candidate for integration with Cuda is actually BooleanScorer (BS1, the one that does scoring in bulk) based on previous comments? > Explore GPU acceleration for spatial search > --- > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial-extras >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: gsoc2017, mentor > Attachments: gpu-benchmarks.png > > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524890#comment-16524890 ] Ishan Chattopadhyaya commented on LUCENE-7745: -- Here [0] are some very initial experiments that I ran, along with Kishore Angani, a colleague at Unbxd. 1. Generic problem: Given a result set (of document hits) and a scoring function, return a sorted list of documents along with the computed scores. 2. Specific problem: Given (up to 11M) points and associated docids, compute the distance from a given query point. Return the sorted list of documents based on these distances. 3. GPU implementation based on the Thrust library (a C++ based, Apache 2.0 licensed library), called from a JNI wrapper. Timings include copying data (scores and sorted docids) back from the GPU to the host system and access from Java (via DirectByteBuffer). 4. CPU implementation was based on SpatialExample [1], which is perhaps not the fastest (points fields are better, I think). 5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 1080, 11GB GDDR5 memory. Results seem promising. The GPU is able to score 11M documents in ~50ms! Here, blue is GPU and red is CPU (Lucene). !Screenshot from 2018-06-27 15-33-37.png! [0] - https://github.com/chatman/gpu-benchmarks [1] - https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). 
> First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
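The "specific problem" in Ishan's comment above (compute the distance of every point from a query point, then return the docids sorted by distance) can be sketched on the CPU as follows. This is not the code from the linked benchmark repository. It assumes Haversine distance, which a later comment in the thread mentions; the class name, method names, array layout and Earth radius constant are choices made for this sketch.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical CPU reference for "score by distance, then sort"; not the benchmark code. */
public class DistanceSort {
    /** Mean Earth radius in kilometres (an assumed constant for this sketch). */
    static final double EARTH_RADIUS_KM = 6371.0088;

    /** Great-circle distance in km between two (lat, lon) points given in degrees. */
    public static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Returns the doc ids ordered by increasing distance from the query point. */
    public static int[] sortByDistance(int[] docIds, double[] lats, double[] lons,
                                       double queryLat, double queryLon) {
        int n = docIds.length;
        // Step 1: score every document. Each iteration is independent (one GPU thread each).
        double[] dist = new double[n];
        for (int i = 0; i < n; i++) {
            dist[i] = haversineKm(queryLat, queryLon, lats[i], lons[i]);
        }
        // Step 2: sort document positions by their computed distance.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> dist[i]));
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = docIds[order[i]];
        }
        return sorted;
    }
}
```

On a GPU the two steps would map to a per-element transform followed by a key-value sort, which is the kind of operation a library such as Thrust provides.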
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953766#comment-15953766 ] vikash commented on LUCENE-7745: Oops, I could not do that. I have submitted my proposal, by the way; if you check it now, the latest edited version is the submitted one. Sadly, I could not change the GitHub link, so it only points to my home directory on GitHub. Can I still start working and give you the link to my work? And if it is possible for you, could you show my work to the Apache Software Foundation? Will that be OK? > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953673#comment-15953673 ] Ishan Chattopadhyaya commented on LUCENE-7745: -- If you still haven't submitted your proposal, I have an idea for you to improve your chances. Include a link to a github repository in the application for your initial experiments. After that, you can try to build a prototype in the next few days (until assessment starts) that demonstrates that you are on the right track. This is not strictly necessary, but just throwing out an idea that might benefit you. All the best and regards! > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953579#comment-15953579 ] vikash commented on LUCENE-7745: It is my First GSOC and so it was a bit difficult for me to draft the proposal properly. > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953521#comment-15953521 ] Ishan Chattopadhyaya commented on LUCENE-7745: -- Hi Vikash, I have reviewed the proposal. It is still extremely disorganized and it is not clear what your goals are and how you have split them up into tasks. It contains lots of copy paste of comments/statements from this JIRA or comments from the proposal itself. The level of details still seems inadequate to me. I had proposed a possible way to structure your proposal (by splitting the three months into three different areas of focus, all of them I specified in the comments), but I don't see that you've done so. I asked you to find out, at least, what the default Similarity in Lucene is called (and to attempt to simulate the scoring for that on the GPU). It seems you have not done so. At this point, I don't think much can be done (just 2 hours to go for submission). Wish you the best. > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. 
I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
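For readers following the review above: since Lucene 6.0 the default Similarity is BM25Similarity. A minimal sketch of what "simulating the scoring" for a single term in bulk could look like is given below. It uses the textbook BM25 formula with the common defaults k1 = 1.2 and b = 0.75; Lucene's own implementation differs in details (for example, document lengths are stored in a lossy encoded form), so treat this as an approximation. The class name, method names and array layout are assumptions made for this sketch.

```java
/** Hypothetical bulk BM25 scorer for one term; textbook formula, not Lucene's exact code. */
public class Bm25Bulk {
    static final float K1 = 1.2f;  // term-frequency saturation
    static final float B = 0.75f;  // document-length normalization strength

    /** Inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5)). */
    public static double idf(long docCount, long docFreq) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /**
     * Scores all matching documents of one term in a single pass.
     * termFreqs[i] and docLengths[i] describe the i-th matching document.
     */
    public static float[] scoreAll(int[] termFreqs, int[] docLengths,
                                   double avgDocLength, double idf) {
        float[] scores = new float[termFreqs.length];
        // Every iteration is independent of the others, so on a GPU each
        // document could be handled by its own thread.
        for (int i = 0; i < termFreqs.length; i++) {
            double norm = K1 * (1 - B + B * docLengths[i] / avgDocLength);
            scores[i] = (float) (idf * termFreqs[i] * (K1 + 1) / (termFreqs[i] + norm));
        }
        return scores;
    }
}
```

A quick sanity check of the formula: for a document of exactly average length with a term frequency of 1, the length normalization collapses to k1 and the score equals the idf.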
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953288#comment-15953288 ] vikash commented on LUCENE-7745: Hi Ishan, I have changed the proposal according to your instructions. Can you review it again? > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952849#comment-15952849 ] Ishan Chattopadhyaya commented on LUCENE-7745: -- I have left initial comments on your draft. Let me know if you want another round of review, perhaps after you've addressed the current comments. > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952838#comment-15952838 ] vikash commented on LUCENE-7745: Yeah, I had already read the student manual. The deadline is 3rd April and it's too close; in the preparation I almost missed the deadline for application. OK, so for the proposal, my draft is here (you may comment on it and I will do the needful): https://docs.google.com/document/d/1HGxU1ZudNdAboj0s9WKTWJk3anbZm86JY-abaflXoEI/edit?usp=sharing > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952388#comment-15952388 ] Ishan Chattopadhyaya commented on LUCENE-7745: -- Hi Vikash, I suggest you read the student manuals for GSoC. Submit a proposal on how you want to approach this project, including technical details (as much as possible) and detailed timelines. Regarding the following: {code} 1. First, understand how BooleanScorer calls these similarity classes and does the scoring. There are unit tests in Lucene that can help you get there. This might help: https://wiki.apache.org/lucene-java/HowToContribute 2. Write a standalone CUDA/OpenCL project that does the same processing on the GPU. 3. Benchmark the speed of doing so on GPU vs. speed observed when doing the same through the BooleanScorer. Preferably, on a large resultset. Include time for copying results and scores in and out of the device memory from/to the main memory. 4. Optimize step 2, if possible. {code} If you've already understood step 1, feel free to make a proposal on how you will use your GSoC coding time to achieve steps 2-4. Also, you can look at other stretch goals to be included in the coding time. I would consider that steps 2-4, if done properly and successfully, are themselves a good GSoC contribution. And if these steps are done properly, then either Lucene integration can be proposed for the latter part of the coding phase (last 2-3 weeks, I'd think), or exploratory work on other parts of Lucene (apart from the BooleanScorer, e.g. spatial search filtering etc.) could be taken up. Time is running out, so kindly submit a proposal as soon as possible. You can submit a draft first, have one of us review it and then submit it as final after the review. If the deadline is too close, there might not be enough time for this round of review, and in such a case just submit the draft as final. 
Also, remember a lot of the GPGPU coding is done on C, so familiarity/experience with that is a plus. (Just a suggestion that makes sense to me, and feel free to ignore: bullet points work better than long paragraphs, even though the length of sentences can remain the same) > Explore GPU acceleration > > > Key: LUCENE-7745 > URL: https://issues.apache.org/jira/browse/LUCENE-7745 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya > Labels: gsoc2017, mentor > > There are parts of Lucene that can potentially be speeded up if computations > were to be offloaded from CPU to the GPU(s). With commodity GPUs having as > high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to > speed parts of Lucene (indexing, search). > First that comes to mind is spatial filtering, which is traditionally known > to be a good candidate for GPU based speedup (esp. when complex polygons are > involved). In the past, Mike McCandless has mentioned that "both initial > indexing and merging are CPU/IO intensive, but they are very amenable to > soaking up the hardware's concurrency." > I'm opening this issue as an exploratory task, suitable for a GSoC project. I > volunteer to mentor any GSoC student willing to work on this this summer. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952375#comment-15952375 ]

vikash commented on LUCENE-7745:

Hello all,

I have been reading a lot about how GPUs work and about GPU parallelization, in particular General Purpose computing on Graphics Processing Units, and have also looked in detail at the source code of BooleanScorer.java. It's a nice piece of code and I am having no difficulty understanding how it works, since Java is my speciality, so the job was quite fun. There are a few things that seem unclear to me, but I am reading and experimenting, so I will resolve them soon. Using the GPU to perform the search and indexing operations on a document is a nice idea, and it would be faster.

Regarding the licensing issue: since we are generating code and, as was said above, the code that we generate may not necessarily go into Lucene, will licensing still be an issue if we use the libraries in our code? And as Uwe Schindler said, we may host the code on GitHub. Certainly it would not be a good idea to develop code for special hardware, but we can still give it a try and make it compatible with most hardware.

I don't mind if this code does not go into Lucene, but we can try to change Lucene and make it better. I am preparing myself for it, and things would stay on track with your kind mentorship.

So should I submit my proposal now, or do I need to complete all four steps that Ishan listed in the last comment and then submit my proposal? And which one of the ideas would you prefer to mentor me on, that is, which one do you think would be the better one to continue with?

> Copy over and index lots of points and corresponding docids to the GPU as an
> offline, one time operation. Then, given a query point, return top-n nearest
> indexed points.

> Copy over and index lots of points and corresponding docids to the GPU as an
> offline, one time operation. Then, given a polygon (complex shape), return all
> points that lie inside the polygon.

> Benchmarking an aggregation over a DocValues field and comparing the
> corresponding performance when executed on the GPU.

> Benchmarking the speed of calculations on GPU vs. speed observed when doing
> the same through the BooleanScorer. Preferably, on a large result set with the
> time for copying results and scores in and out of the device memory from/to
> the main memory included?

-Vikash
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945637#comment-15945637 ]

Ishan Chattopadhyaya commented on LUCENE-7745:

bq. Java CUDA libraries exist and what their licenses

jCuda happens to be MIT, which is, afaik, compatible with the Apache license.
http://www.jcuda.org/License.txt
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945632#comment-15945632 ]

Ishan Chattopadhyaya commented on LUCENE-7745:

Hi Vikash,

Regarding licensing issue: The work done in this project would be exploratory. That code won't necessarily go into Lucene. When we are at a point where we see clear benefits from the work done here, we would then have to explore all aspects of productionizing it (including licensing).

Regarding next steps:
{quote}
BooleanScorer calls a lot of classes, e.g. the BM25 similarity or TF-IDF to do the calculation that could possibly be parallelized.
{quote}
# First, understand how BooleanScorer calls these similarity classes and does the scoring. There are unit tests in Lucene that can help you get there. This might help: https://wiki.apache.org/lucene-java/HowToContribute
# Write a standalone CUDA/OpenCL project that does the same processing on the GPU.
# Benchmark the speed of doing so on GPU vs. speed observed when doing the same through the BooleanScorer. Preferably, on a large resultset. Include time for copying results and scores in and out of the device memory from/to the main memory.
# Optimize step 2, if possible.

Once this is achieved (which in itself could be a sufficient GSoC project), one can have stretch goals to try out other parts of Lucene to optimize (e.g. spatial search).
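The benchmarking step in the list above asks that copy time in and out of device memory be counted along with the compute time. A minimal Java sketch of such a harness follows. Everything in it is an assumption made for illustration: the `Device` interface and `FakeDevice` are hypothetical, and the "device" is just another Java array, so the sketch runs without a GPU and says nothing about real transfer costs.

```java
/** Sketch of a benchmark that times transfer and compute separately. Not Lucene or CUDA code. */
final class OffloadBenchmarkSketch {

    /** Nanosecond timings for one offloaded batch. */
    static final class Timings {
        long copyIn;
        long compute;
        long copyOut;

        long total() {
            return copyIn + compute + copyOut;
        }
    }

    /** Hypothetical device abstraction; a real one would wrap CUDA or OpenCL calls. */
    interface Device {
        void upload(float[] host);

        void run();

        void download(float[] host);
    }

    /** Times the three phases so that transfer cost is reported next to compute cost. */
    static Timings measure(Device device, float[] input, float[] output) {
        Timings t = new Timings();
        long t0 = System.nanoTime();
        device.upload(input);
        long t1 = System.nanoTime();
        device.run();
        long t2 = System.nanoTime();
        device.download(output);
        long t3 = System.nanoTime();
        t.copyIn = t1 - t0;
        t.compute = t2 - t1;
        t.copyOut = t3 - t2;
        return t;
    }

    /** Stand-in device backed by a Java array; it takes the square root of every value. */
    static final class FakeDevice implements Device {
        private float[] mem = new float[0];

        public void upload(float[] host) {
            mem = host.clone();
        }

        public void run() {
            for (int i = 0; i < mem.length; i++) {
                mem[i] = (float) Math.sqrt(mem[i]);
            }
        }

        public void download(float[] host) {
            System.arraycopy(mem, 0, host, 0, mem.length);
        }
    }

    public static void main(String[] args) {
        float[] input = new float[1 << 20];
        java.util.Arrays.fill(input, 2.0f);
        float[] output = new float[input.length];
        Timings t = measure(new FakeDevice(), input, output);
        System.out.println("copyIn=" + t.copyIn + "ns compute=" + t.compute
                + "ns copyOut=" + t.copyOut + "ns total=" + t.total() + "ns");
    }
}
```

Reporting the three phases separately makes it visible when a fast kernel is cancelled out by the transfer cost, which is the comparison the step asks for. A serious measurement would also need warm-up runs and repeated trials.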
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945065#comment-15945065 ]

David Smiley commented on LUCENE-7745:

vikash: not all working code contributed to any open-source project is necessarily welcome. Usually it is but sometimes project members or ASF rules insist on certain things for the perceived greater good. In this case, I believe Uwe doesn't want Lucene to include anything that would only work with certain hardware or JVM vendors -- even if it was optional opt-in.

If hypothetically nobody had such concerns here, be aware that any 3rd party (non-ASF) libraries need to meet certain qualifications. For example, *if* whatever Java CUDA library you find happens to be licensed as GPL, then it's incompatible with ASF run projects like this one. That's a hypothetical; I have no idea what Java CUDA libraries exist and what their licenses are.

Regardless... if you come up with something useful, it's probably not necessary that Lucene itself change, and as seen here we have some willingness to change Lucene (details TBD) if it enables people to use Lucene with CUDA. Lucene has many extension points already. Though I could imagine you might unfortunately need to copy/fork some long source files -- Uwe mentioned some.

Good luck.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15944884#comment-15944884 ]

vikash commented on LUCENE-7745:

Hi all,

I have been reading about GPU acceleration, in particular GPU-accelerated computing. I find this project very interesting, so can anyone give me a further lead on what is to be done now? I mean, the ideas that Ishan suggested are pretty good, but I still do not understand what Mr David means by "(a) could whatever comes of this actually be contributed to Lucene itself". Why can the outcome of this project not be contributed to Lucene?
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932714#comment-15932714 ]

Uwe Schindler commented on LUCENE-7745:

Hi,

in general, including CUDA into Lucene may be a good idea, but I see no real possibility to do this inside Lucene Core or any other module. My idea would be to add some abstraction to the relevant parts of Lucene and make it easier to "plug in" different implementations. Then this code could also be hosted outside Lucene (if licenses are a problem) anywhere on GitHub.

We should still keep the following in mind: Mike's example looks "simple" as a quick test to see if there are gains, but making the whole thing ready for commit or bundling in any project in/outside Lucene is a whole different story. Currently BooleanScorer calls a lot of classes, e.g. the BM25 similarity or TF-IDF to do the calculation that could possibly be parallelized. But for moving all this to CUDA, you have to add "plugin points" everywhere and change APIs completely. It is also hard to test, because none of our Jenkins servers has a GPU! Also for users of Lucene, this could be a huge problem, if we add native stuff into Lucene that they may never use. Because of that it MUST BE SEPARATED from Lucene core. Completely...

IMHO, I'd create a full new search engine like CLucene in C code if I would solely focus on GPU parallelization. The current iterator based approaches are not easy to transform or plug into CUDA...

For the GSoC project, we should make clear to the GSoC student that this is just a project to "explore" GPU acceleration and whether it brings any performance gain. I doubt that, because the call overhead between Java and CUDA is way too high, in contrast to Postgres, where everything is in plain C/C++. The results would then be used to plan and investigate ways how to include that into Lucene as "plugin points" (e.g., as SPI modules).
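The "plugin points" idea in the comment above (SPI modules hosted outside Lucene) could look roughly like the following Java sketch. It uses the JDK's standard `java.util.ServiceLoader`; the interface `BatchScorerProvider`, the class names and the toy scoring formula are all hypothetical and exist nowhere in Lucene. An external GPU module would register its implementation in `META-INF/services`, and the core would fall back to a pure-Java implementation when none is present.

```java
import java.util.ServiceLoader;

/** Hypothetical plug-in point; no such interface exists in Lucene. */
interface BatchScorerProvider {
    String name();

    /** Lets a native-backed provider report that its hardware or library is missing. */
    boolean isAvailable();

    /** Scores a batch of hits; the formula used here is only a placeholder. */
    void score(float[] freqs, float[] norms, float weight, float[] out);
}

/** Pure-Java fallback that is always available. */
final class CpuBatchScorer implements BatchScorerProvider {
    public String name() {
        return "cpu";
    }

    public boolean isAvailable() {
        return true;
    }

    public void score(float[] freqs, float[] norms, float weight, float[] out) {
        for (int i = 0; i < freqs.length; i++) {
            out[i] = weight * freqs[i] / (freqs[i] + norms[i]);
        }
    }
}

final class BatchScorers {
    /** Returns the first usable provider found on the classpath, or the CPU fallback. */
    static BatchScorerProvider pick() {
        for (BatchScorerProvider p : ServiceLoader.load(BatchScorerProvider.class)) {
            if (p.isAvailable()) {
                return p;
            }
        }
        return new CpuBatchScorer();
    }

    public static void main(String[] args) {
        BatchScorerProvider scorer = pick();
        float[] out = new float[2];
        scorer.score(new float[] {1f, 3f}, new float[] {1f, 1f}, 2f, out);
        System.out.println(scorer.name() + ": " + out[0] + ", " + out[1]);
    }
}
```

With this shape the core carries no native dependency: if no provider is registered, `pick()` returns the CPU implementation, and a GPU-backed jar can live anywhere outside the Lucene distribution.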
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932604#comment-15932604 ]

Ishan Chattopadhyaya commented on LUCENE-7745:

[~dsmiley], that is a very important question. Afaik, there is no Apache compatible GPGPU framework. Both OpenCL and CUDA are likely incompatible with Apache (I am not fully sure). I see that jCUDA, which is a wrapper around the native libraries, is MIT licensed.

If there are benefits to using GPGPU processing, my thought is that we can ensure all necessary plumbing in our codebase in order to offload processing to some plugin, whereby the user can plug in the exact GPU kernels from outside the Lucene distribution (if those kernels also violate any licensing restrictions we have). If there are clear benefits in speeding things up using a GPU, it would not be, *for the end-user*, the end of the world if the code comes from outside the Apache distribution.

bq. If (a) is a "no", we need to be honest up front with the contributor.

That is a good point, and we can document this clearly.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932589#comment-15932589 ]

Michael McCandless commented on LUCENE-7745:

Maybe even the basic hit scoring that e.g. BooleanScorer does with disjunction of high frequency terms, would be amenable to GPU acceleration? Today BooleanScorer processes a whole window of hits at once, doing fairly simple math (the Similarity methods) on each.
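To make "fairly simple math on each hit in a window" concrete, here is a minimal Java sketch of that pattern. It is not Lucene's BooleanScorer or Similarity code: the class name, the array layout and the use of the textbook BM25 formula with k1 = 1.2 and b = 0.75 are assumptions chosen for illustration. Apart from the accumulation into a shared score slot, each iteration is independent of the others, which is the property that would make such a loop a candidate for a GPU kernel.

```java
/** Illustrative stand-in for per-window scoring; not Lucene's actual implementation. */
final class WindowScorerSketch {
    static final float K1 = 1.2f;
    static final float B = 0.75f;

    /** Textbook BM25 idf: ln(1 + (N - n + 0.5) / (n + 0.5)). */
    static float idf(long docCount, long docFreq) {
        return (float) Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /**
     * Adds one term's contribution to the window. slots[i] is the position of hit i
     * inside the window; freqs[i] and docLens[i] belong to the same hit.
     */
    static void accumulate(float[] scores, int[] slots, int[] freqs, int[] docLens,
                           float avgDocLen, float idf) {
        for (int i = 0; i < slots.length; i++) {
            // Length normalisation for this hit.
            float norm = K1 * (1 - B + B * docLens[i] / avgDocLen);
            scores[slots[i]] += idf * freqs[i] * (K1 + 1) / (freqs[i] + norm);
        }
    }

    public static void main(String[] args) {
        float[] scores = new float[8];                  // one slot per doc in the window
        float idf = idf(1000, 10);
        accumulate(scores, new int[] {0, 3, 5}, new int[] {1, 4, 2},
                new int[] {100, 100, 250}, 100f, idf);
        for (int slot = 0; slot < scores.length; slot++) {
            System.out.println("slot " + slot + " -> " + scores[slot]);
        }
    }
}
```

A disjunction over several terms would call `accumulate` once per term on the same `scores` array, so a parallel version has to handle two terms hitting the same slot, for example with atomic adds or one pass per term.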
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932568#comment-15932568 ]

David Smiley commented on LUCENE-7745:

I have a question to us all. (a) could whatever comes of this actually be contributed to Lucene itself given the likelihood of requiring native O.S. bindings (let's presume in spatial-extras as it seems this is the only module that can have an external dependency), and (b) does that matter for GSoC or to the expectations of the contributor? If (a) is a "no", we need to be honest up front with the contributor. I know in the past Solr has been denied off-heap filters that would have required an un-pure Java approach. A native binding would be another degree of un-purity :-)
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932556#comment-15932556 ]

Ishan Chattopadhyaya commented on LUCENE-7745:

Another experiment that, I think, is worth trying out:
* Benchmarking an aggregation over a DocValues field (e.g. using sqrt(), haversine distance etc.), and comparing the corresponding performance when executed on the GPU. This could potentially speed up scoring of results.

For reference, PostgreSQL seems to have experienced speedup in some areas (esp. aggregations over column oriented fields): https://www.slideshare.net/kaigai/gpgpu-accelerates-postgresql
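A CPU baseline for the aggregation experiment above might look like the following Java sketch. It assumes the DocValues columns have already been read into plain `double[]` arrays, and it does not use Lucene's DocValues API; the class and method names are hypothetical. The haversine formula is the standard one, with a mean Earth radius of 6371.0088 km.

```java
/** CPU baseline sketch: aggregate haversine distance over column-oriented values. */
final class HaversineAggregationSketch {
    static final double EARTH_RADIUS_KM = 6371.0088;

    /** Great-circle distance in kilometres between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Mean distance from the query point to every document's point. */
    static double meanDistanceKm(double[] lats, double[] lons, double qLat, double qLon) {
        double sum = 0;
        for (int i = 0; i < lats.length; i++) {
            sum += haversineKm(qLat, qLon, lats[i], lons[i]);
        }
        return lats.length == 0 ? 0 : sum / lats.length;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        double[] lats = new double[n];
        double[] lons = new double[n];
        for (int i = 0; i < n; i++) {                   // synthetic points, deterministic
            lats[i] = -90 + 180.0 * i / n;
            lons[i] = -180 + 360.0 * i / n;
        }
        long start = System.nanoTime();
        double mean = meanDistanceKm(lats, lons, 48.85, 2.35);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("mean distance km: " + mean + " (" + elapsedMs + " ms)");
    }
}
```

The per-document work is a handful of trigonometric calls with no dependency between documents, followed by a sum, so a GPU version would be a map step plus a parallel reduction.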
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931975#comment-15931975 ]

Ishan Chattopadhyaya commented on LUCENE-7745:

Here are some ideas on things to start out with:
# Copy over and index lots of points and corresponding docids to the GPU as an offline, one time operation. Then, given a query point, return top-n nearest indexed points.
# Copy over and index lots of points and corresponding docids to the GPU as an offline, one time operation. Then, given a polygon (complex shape), return all points that lie inside the polygon.

In both cases, compare performance against existing Lucene spatial search. One would need to choose the most suitable algorithm for doing these as efficiently as possible. Any GPGPU API can be used for now (OpenCL, CUDA) for initial exploration.

[~dsmiley], [~kwri...@metacarta.com], [~nknize], [~mikemccand], given your depth and expertise in this area, do you have any suggestions? Any other area of Lucene that comes to mind which should be easiest to start with, in terms of exploring GPU based parallelization?
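The two ideas above can be sketched as brute-force CPU routines in Java, which show the per-point work a GPU kernel would perform. This is only an illustration under assumptions: planar x/y coordinates, hypothetical class and method names, a sort-based top-n and an even-odd ray-casting test. It is neither how Lucene's spatial search works nor a tuned algorithm.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Brute-force sketches of the two spatial ideas; coordinates are treated as planar. */
final class SpatialBaselineSketch {

    /** Even-odd ray-casting test: is (x, y) inside the polygon given by its vertices? */
    static boolean contains(double[] polyX, double[] polyY, double x, double y) {
        boolean inside = false;
        for (int i = 0, j = polyX.length - 1; i < polyX.length; j = i++) {
            boolean crosses = (polyY[i] > y) != (polyY[j] > y);
            if (crosses && x < (polyX[j] - polyX[i]) * (y - polyY[i])
                    / (polyY[j] - polyY[i]) + polyX[i]) {
                inside = !inside;
            }
        }
        return inside;
    }

    /** Returns the docids whose point lies inside the polygon, in input order. */
    static int[] insidePolygon(double[] xs, double[] ys, int[] docIds,
                               double[] polyX, double[] polyY) {
        int[] hits = new int[docIds.length];
        int count = 0;
        for (int i = 0; i < docIds.length; i++) {
            if (contains(polyX, polyY, xs[i], ys[i])) {
                hits[count++] = docIds[i];
            }
        }
        return Arrays.copyOf(hits, count);
    }

    /** Returns the docids of the n points closest to (qx, qy), nearest first. */
    static int[] nearest(double[] xs, double[] ys, int[] docIds, double qx, double qy, int n) {
        double[] dist2 = new double[xs.length];
        Integer[] order = new Integer[xs.length];
        for (int i = 0; i < xs.length; i++) {
            double dx = xs[i] - qx;
            double dy = ys[i] - qy;
            dist2[i] = dx * dx + dy * dy;               // squared distance keeps the ordering
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> dist2[i]));
        int[] top = new int[Math.min(n, order.length)];
        for (int k = 0; k < top.length; k++) {
            top[k] = docIds[order[k]];
        }
        return top;
    }

    public static void main(String[] args) {
        double[] xs = {1, 5, 2};
        double[] ys = {1, 5, 3};
        int[] docIds = {10, 11, 12};
        double[] squareX = {0, 4, 4, 0};
        double[] squareY = {0, 0, 4, 4};
        System.out.println(Arrays.toString(insidePolygon(xs, ys, docIds, squareX, squareY)));
        System.out.println(Arrays.toString(nearest(xs, ys, docIds, 5, 5, 2)));
    }
}
```

In both routines each point is tested independently of the others, which is why these tasks are commonly suggested for GPUs. For the top-n case, the distance computation parallelises trivially, while the selection step would need a parallel partial sort or reduction instead of the full sort used here.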
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928714#comment-15928714 ]

vikash commented on LUCENE-7745:

Hi, I am willing to work on this.