Secondary index reverse sort
I've seen that the results of secondary index queries are sorted on index values by default. I was wondering if there's something I'm missing that would allow me to fetch those keys but reverse sorted. I have indexes based on UNIX timestamps and I'd like to grab the most recent keys. I'd like this query to be running on demand so I'd like to avoid MapReduce if at all possible. ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Reuse of Buckets (Java Client)
Is the Bucket class reusable and thread-safe? I.e. can I create my Bucket objects during instantiation of my client class, and then reuse the same bucket for all operations for the application lifetime? Or should buckets be re-created for each request? Thanks -- Nico Huysamen Senior Software Developer | Ad Dynamo
Re: Reuse of Buckets (Java Client)
Yes, it is thread-safe; you can treat them as singleton instances per bucket. The general usage pattern is:
* Fetch the bucket.
* Optional: if it exists, verify it has your application's values (N value, etc.).
* If it doesn't exist, create it with your settings.
* Cache it as a singleton instance (you could create a `final Map<String, Bucket> buckets = new HashMap<>()`) and re-use it in your application, assuming your initialization is not lazy; if it is, use proper thread-safe initialization.
Hope that helps, Guido. On 31/07/13 08:45, Nico Huysamen wrote: Is the Bucket class reusable and thread-safe? [...]
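The caching pattern described above is language-agnostic; here is a minimal sketch of it in Python (the `fetch_bucket` callable is hypothetical and stands in for whatever client call fetches or creates a bucket with your settings):

```python
import threading

class BucketCache:
    """Cache bucket handles as singletons, creating each at most once."""

    def __init__(self, fetch_bucket):
        # fetch_bucket: hypothetical callable that fetches/creates a bucket by name
        self._fetch = fetch_bucket
        self._buckets = {}
        self._lock = threading.Lock()

    def get(self, name):
        # Double-checked locking: the common path avoids taking the lock.
        bucket = self._buckets.get(name)
        if bucket is None:
            with self._lock:
                bucket = self._buckets.get(name)
                if bucket is None:
                    bucket = self._buckets[name] = self._fetch(name)
        return bucket
```

With eager (non-lazy) initialization you can drop the lock entirely and fill the map once at startup, which is the simpler variant Guido suggests.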
Re: combining Riak (CS) and Spark/shark by speaking over s3 protocol
Dan, Not sure if I understand the renaming-objects problem in Riak CS. Can you elaborate? "I believe smaller object sizes would not be nearly as efficient as working with plain Riak if only because of the overhead incurred by Riak CS" - does this mean lack of efficiency in disk storage, in memory, or both? Moreover, I have this nagging thought that having to dig through the manifest to find the blocks will severely impact read latency for (multi-key) lookups as opposed to a normal Bitcask / LevelDB lookup. Is this correct? Best, Geert-Jan
2013/7/30 Dan Kerrigan dan.kerri...@gmail.com: Geert-Jan - We're currently working on a somewhat similar project to integrate Flume to ingest data into Riak CS for later processing using Hadoop. The limitations of HDFS/S3, when using the s3:// or s3n:// URIs, seem to revolve around renaming objects (copy/delete) in Riak CS. If you can avoid that, this link should work fine. Regarding how data is stored in Riak CS, the data block storage is Bitcask, with manifest storage held in LevelDB. Riak CS is optimized for larger object sizes, and I believe smaller object sizes would not be nearly as efficient as working with plain Riak, if only because of the overhead incurred by Riak CS. The benefits of Riak generally carry over to Riak CS, so there shouldn't be any need to worry about losing raw power. Respectfully - Dan Kerrigan
On Tue, Jul 30, 2013 at 2:21 PM, gbrits gbr...@gmail.com wrote: This may be totally missing the mark, but I've been reading up on ways to do fast iterative processing in Storm or Spark/Shark, with the ultimate goal of results ending up in Riak for fast multi-key retrieval. I want this setup to be as lean as possible for obvious reasons, so I've started to look more closely at a possible Riak CS / Spark combo. Apparently (please correct me if wrong) Riak CS sits on top of Riak and is S3-API compliant. Underlying the DB for the objects is LevelDB (which would have been my choice anyway, because of the low in-memory key overhead). Apparently Bitcask is also used, although it's not clear to me what for exactly. At the same time, Spark (with Shark on top, which is to Spark what Hive is to Hadoop, if that in any way makes things clearer) can use HDFS or S3 as its so-called 'deep store'. Combining this, it seems Riak CS and Spark/Shark could be a pretty tight combo, providing iterative and ad hoc querying through Shark plus all the excellent stuff of Riak, through the S3 protocol which they both speak. Is this correct? Would I lose any of the raw power of Riak when going with Riak CS? Anyone ever tried this combo? Thanks, Geert-Jan
Re: combining Riak (CS) and Spark/shark by speaking over s3 protocol
Thanks for the links, Mark. Certainly looks possible to me. A Riak + Spark/Shark setup almost looks like a match made in heaven, so I'm doing my due diligence before getting too excited, since there's not too much work around combining the two, suggesting I might be overlooking something. Going to try the setup and see what comes out. 2013/7/31 Mark Hamstra: Others have certainly found benefits in combining Spark/Shark with a Dynamo-type KV store. With robust Hadoop Input/OutputFormats it's not too difficult (e.g. see this: http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final and this: http://tuplejump.github.io/calliope/), and it may be possible to do as you suggest with the S3 API of Riak CS. What also may be worth exploring is whether Riak and Spark/Shark can rendezvous via Tachyon: https://github.com/amplab/tachyon/wiki. That would be more of a research project right now, but it could end up someplace interesting. On Tue, Jul 30, 2013 at 1:24 PM, Dan Kerrigan wrote: Geert-Jan - We're currently working on a somewhat similar project to integrate Flume to ingest data into Riak CS for later processing using Hadoop. [...]
Re: Secondary index reverse sort
Hi Lucas, I'm sorry; as easy as it would have been to add with the latest changes, we just ran out of time. It is something I'd love to add in the future. Or maybe something a contributor could add? (Happy to advise / review.) Many thanks, Russell. On 31 Jul 2013, at 02:04, Lucas Cooper bobobo1...@gmail.com wrote: I've seen that the results of secondary index queries are sorted on index values by default. [...]
Re: Secondary index reverse sort
I'm happy to wait; it isn't urgently needed, as my project is still in development. I'd contribute myself if I were at all confident programming in Erlang, but I'm still just getting into declarative languages at the moment :) On Wed, Jul 31, 2013 at 10:03 PM, Russell Brown russell.br...@mac.com wrote: Hi Lucas, I'm sorry, as easy as it would have been to add with the latest changes, we just ran out of time. [...]
Re: Secondary index reverse sort
As a workaround, you can always store (2^31 - timestamp) as an additional index and use that when you need to do the reverse retrieval. Beware 2038. Jon On Wed, Jul 31, 2013 at 1:07 PM, Lucas Cooper bobobo1...@gmail.com wrote: I'm happy to wait, it isn't urgently needed as my project is still in development. [...] -- Jon Meredith VP, Engineering Basho Technologies, Inc. jmered...@basho.com
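Jon's workaround comes down to plain arithmetic: store the inverted value as a second integer index, and an ascending range query over that index then returns keys newest-first. A sketch (the index itself isn't shown; this only demonstrates the ordering trick):

```python
MAX_TS = 2 ** 31  # signed 32-bit rollover; this scheme breaks for timestamps past January 2038

def inverted(ts):
    """Value to store in the extra index so that ascending order == newest first."""
    return MAX_TS - ts

# Three UNIX timestamps, oldest to newest.
timestamps = [1375142400, 1375228800, 1375315200]

# Sorting by the inverted value (what the index would do) yields newest first.
newest_first = sorted(timestamps, key=inverted)
```

The same trick works for any monotonically increasing integer key, as long as you pick a ceiling larger than any value you will ever store.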
Re: combining Riak (CS) and Spark/shark by speaking over s3 protocol
Geert-Jan, Riak CS currently doesn't support the S3 Copy command. Flume and Hadoop distcp create a temporary object and then attempt to Copy that object to its permanent location; Rename is a Copy then a Delete, since the S3 API doesn't support Rename. Regarding efficiency, Riak CS block sizes are 1MB (100 MB object = 100 Riak Bitcask-stored objects), so you can use the Bitcask calculator at [0] to get a rough estimate of the requirements to store your particular dataset. Regarding the impact to read latency, severe is probably not the right word, but there is an impact. Besides API support, your decision will, in part, come down to how large your object sizes are going to be. The Riak FAQ [1] currently suggests that Riak object sizes should be less than 10MB; Riak CS, on the other hand, can handle object sizes up to 5TB. If you are doing multi-key retrieves for lots of small objects, Riak looks like the right choice; otherwise, go with Riak CS. Some basic testing would go a long way toward finding the balance in your case. Respectfully - Dan Kerrigan [0] http://docs.basho.com/riak/latest/ops/building/planning/bitcask/ [1] http://docs.basho.com/riak/latest/community/faqs/developing/#is-there-a-limit-on-the-file-size-that-can-be-stor On Wed, Jul 31, 2013 at 4:43 AM, Geert-Jan Brits gbr...@gmail.com wrote: Dan, Not sure if I understand the renaming objects-problem in Riak CS. Can you elaborate? [...]
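Dan's sizing numbers can be turned into a back-of-the-envelope estimate of how many Riak-level objects one Riak CS object produces (a sketch only; the 1MB block size is from Dan's note, and per-key memory cost should come from the Bitcask calculator he links, not from this):

```python
import math

BLOCK_SIZE = 1024 * 1024  # Riak CS splits object data into 1 MB blocks

def riak_blocks_for(object_size_bytes):
    """Number of Bitcask-stored block objects for one S3 object
    (the LevelDB-held manifest is counted separately)."""
    return max(1, math.ceil(object_size_bytes / BLOCK_SIZE))

# Dan's example: a 100 MB object becomes 100 Bitcask-stored blocks.
blocks_100mb = riak_blocks_for(100 * 1024 * 1024)

# Geert-Jan's 8 KB objects still occupy one whole block each,
# which is why per-object overhead dominates for small objects.
blocks_8kb = riak_blocks_for(8 * 1024)
```

For a dataset of many small objects, the block count equals the object count, so the manifest lookup plus block fetch is pure overhead compared to a single plain-Riak get.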
Re: Secondary index reverse sort
That should work well in development for my application, actually. Thanks for the tip! On Wed, Jul 31, 2013 at 10:26 PM, Jon Meredith jmered...@basho.com wrote: As a workaround, you can always store (2^31 - timestamp) as an additional index and use that when you need to do the reverse retrieval. Beware 2038. [...]
Re: Installing protobuf 2.5.0 for Riak Python 2.0
Matt, For compatibility reasons, we use 2.4.1, which is pinned in the requirements of the riak_pb package. We intend to move to 2.5 for later releases. On Tue, Jul 30, 2013 at 11:02 PM, Matt Black matt.bl...@jbadigital.com wrote: Hello list, I've been eagerly awaiting the latest update to the python bindings, so it was with great enthusiasm that I started on it this morning! However, I'm unable to install the latest v2.5 of protobuf. Has anyone else had problems? Presumably it works for others on different setups. (Sean?)
(test)vagrant@boomerang:/tmp/protobuf-2.5.0/python$ python setup.py build
running build
running build_py
Generating google/protobuf/unittest_pb2.py...
google/protobuf/unittest_import.proto:53:8: Expected a string naming the file to import.
google/protobuf/unittest.proto: Import google/protobuf/unittest_import.proto was not found or had errors.
google/protobuf/unittest.proto:97:12: protobuf_unittest_import.ImportMessage is not defined.
google/protobuf/unittest.proto:101:12: protobuf_unittest_import.ImportEnum is not defined.
google/protobuf/unittest.proto:107:12: protobuf_unittest_import.PublicImportMessage is not defined.
google/protobuf/unittest.proto:135:12: protobuf_unittest_import.ImportMessage is not defined.
google/protobuf/unittest.proto:139:12: protobuf_unittest_import.ImportEnum is not defined.
google/protobuf/unittest.proto:165:12: protobuf_unittest_import.ImportEnum is not defined.
google/protobuf/unittest.proto:216:12: protobuf_unittest_import.ImportMessage is not defined.
google/protobuf/unittest.proto:221:12: protobuf_unittest_import.ImportEnum is not defined.
google/protobuf/unittest.proto:227:12: protobuf_unittest_import.PublicImportMessage is not defined.
google/protobuf/unittest.proto:256:12: protobuf_unittest_import.ImportMessage is not defined.
google/protobuf/unittest.proto:261:12: protobuf_unittest_import.ImportEnum is not defined.
google/protobuf/unittest.proto:291:12: protobuf_unittest_import.ImportEnum is not defined.
vagrant@boomerang:~$ uname -a
Linux boomerang 3.2.0-29-virtual #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
vagrant@boomerang:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.2 LTS
Release: 12.04
Codename: precise
-- Sean Cribbs s...@basho.com Software Engineer Basho Technologies, Inc. http://basho.com/
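Given Sean's note that riak_pb pins the dependency, the practical fix is to install the pinned version rather than building 2.5.0 from source; a minimal requirements fragment (the comment reflects Sean's note, not official docs):

```
# requirements.txt fragment — protobuf version pinned by riak_pb
# (2.5 support is planned for later riak_pb releases)
protobuf==2.4.1
```

Installing the riak Python package itself should pull this version in automatically via riak_pb's own requirements, so no manual protobuf build is needed.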
Re: combining Riak (CS) and Spark/shark by speaking over s3 protocol
I appreciate the clarification. Object size for the dataset I'm currently investigating is 8KB (on the bit, exactly), so that would be loads of overhead when going the Riak CS route. Up front I already figured Riak directly was a more efficient way to go, but getting a nice Riak + Spark/Shark integration going (through S3) is worth a lot to me as well. Some experimenting to do, I guess :) Thanks, Geert-Jan 2013/7/31 Dan Kerrigan: Geert-Jan, Riak CS currently doesn't support the S3 Copy command. [...]
Riak Recap for July 23-31
Greetings, Riak world. It's time to say farewell to July. There will be many opportunities to see the Basho crew out and about in August, so I've included the next couple of weeks below, and you can always visit http://basho.com/events to see what's coming to your community. Don't forget that the granddaddy Riak event of them all is looming: RICON West in San Francisco in October. Grab your early bird tickets while you can, and expect to hear more about the speakers very soon. http://ricon.io/west.html John twitter.com/macintux
Riak Recap for July 23-31
===
Basho announced a critical issue impacting 1.4.0 upgrades when Riak Control is running.
- http://lists.basho.com/pipermail/riak-critical-issues_lists.basho.com/2013-July/04.html
Brian Roach announced Riak Java Client 1.1.2, 1.4.0 (and shortly thereafter 1.4.1).
- http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-July/012729.html
- http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-July/012767.html
Sean Cribbs belatedly announced riak-erlang-client 1.4.0.
- http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-July/012796.html
Now that Riak 1.4 is out the door, Basho engineers are seeking input on future changes.
- Andrew Thompson: Adding security to Riak
  * http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-July/012730.html
- Brian Roach: Removing HTTP support from the Java client
  * http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-July/012731.html
- Sean Cribbs: Client autoconfig
  * http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-July/012743.html
- Russell Brown: CRDTs
  * http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-July/012744.html
O'Reilly published a gentle rant by Eric Redmond on choices in technology.
- http://programming.oreilly.com/2013/07/nosql-choices-to-misfit-or-cargo-cult.html
Basho on the Road
===
Tonight, Mark Phillips will be talking Riak 1.4 and RICON West at StackMob HQ in San Francisco - July 31 / http://www.meetup.com/San-Francisco-Riak-Meetup/events/129762112/
Tomorrow, Tom Santero and Andrew Stone will be talking consensus algorithms at the Erlang meetup in NYC - August 1 / http://www.meetup.com/Erlang-NYC/events/131394712/
A random sampling of Basho memelords will be present at the Riak meetup in Boston next Monday - August 5 / http://www.meetup.com/Boston-Riak/events/131992742/
Next Wednesday, in Norwich, England, Christian Dahlqvist will be talking about Riak design and data modeling - August 7 / http://www.meetup.com/Norfolk-Developers-NorDev/events/121000182/
Also on Wednesday, in Köln, Germany, Richard Shaw will be talking about Yokozuna - August 7 / http://www.nosql-cologne.org
Next Thursday in Herndon, Virginia, Stuart McCaul will be presenting on Rovio's use of Riak; several other Bashoites will be in attendance - August 8 / http://www.meetup.com/Riak-DC/events/130773522/
Next Saturday, Hector Castro and Casey Rosenthal will be running a Riak workshop at FOSSCON - August 10 / http://fosscon.org
Looking further out, several events take place the following Tuesday:
Andy Gross will be in London talking distributed systems - August 13 / http://www.meetup.com/cloud-nosql/events/126742782/
Tom Santero will host a drinkup in Atlanta - August 13 / http://www.meetup.com/Atlanta-Riak-Meetup/events/131521272/
Pavan Venkatesh will lead a discussion of the changes to Riak 1.4 in Santa Monica - August 13 / http://www.meetup.com/Los-Angeles-Riak-Meetup/events/132040662/
view total number of workers being used?
I was wondering if there's any way to see the total number of workers currently being used (workers used out of worker_limit)?
Unit testing persistence
Hi all ~ Great meetup today - looking forward to upgrading to 1.4. I had a question Mark suggested posting here, which we then discussed with a few other folks too: how do we unit / integration test persistence with Riak? Given a basic dev environment, e.g. running only one physical Riak node locally with all configs default, how do we reliably read the data we just wrote? I have tried setting DW=all (durable write, as recommended for best consistency in the financial example from the Little Riak Book - section for developers, "more than N/R/W") and also tried using {delete_mode, keep} in the riak_kv app.config (since I truncate the buckets after each test suite), but I still get intermittent test failures, as occasionally the data isn't available for reading right after writing. Please note I'm trying to avoid mocking / stubbing, as well as hacks like retrying reads until a certain timeout. I'm looking ideally for a simple configuration or any known best practices. Thanks ~W
Re: Unit testing persistence
Be advised: avoiding mocking/stubbing and keeping your tests unit tests are mutually exclusive. A unit test, by definition, should not have any dependencies whatsoever (on other modules even, let alone a database!). On Wed, Jul 31, 2013 at 9:41 PM, Wagner Camarao wag...@crunchbase.com wrote: Hi all ~ Great meetup today - looking forward to upgrading to 1.4. I had a question Mark suggested posting here: how do we unit / integration test persistence with riak? [...]