Re: Counters question - is there a better way to count
How many distinct uid,someid pairs will you have?

On Dec 5, 2013 3:44 PM, "Christopher Wirt" wrote:
> I want to build a really simple column family which counts the occurrence
> of a single event X.
>
> Once we reach Y occurrences of X the counter resets to 0.
>
> The obvious way to do this is with a counter CF.
>
> CREATE TABLE xcounter1 (
>   uid uuid,
>   someid int,
>   count counter,
>   PRIMARY KEY (uid, someid)
> );
>
> This is how I’ve always done it in the past, but I’ve been told to avoid
> counters for various reasons: performance, consistency, etc.
>
> I’m not too bothered about 100% absolute consistency, however read
> performance is certainly a big concern.
>
> So I was thinking that, to avoid using counters, I could do something like
> this.
>
> CREATE TABLE xcounter2 (
>   uid uuid,
>   someid int,
>   time timeuuid,
>   PRIMARY KEY (uid, someid, time)
> );
>
> Then retrieve all events and count in memory. Delete all records for the
> uid, someid pair once I hit Y.
>
> Or I could:
>
> CREATE TABLE xcounter3 (
>   uid uuid,
>   someid int,
>   time timeuuid,
>   Ycount int,
>   PRIMARY KEY (uid, someid, time)
> );
>
> Insert a ‘Ycount’ on each occurrence of the event.
> Only retrieve the last Ycount value inserted on reading.
> Then delete all records once I hit the magic Y value.
>
> Anyone have any interesting thoughts or insight on what is likely to give
> me the best read performance?
>
> There will be 100s of someid values for each uid. Reads will be 5-10x the
> writes.
>
> Thanks,
>
> Chris
>
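To make the trade-off between the three quoted designs easier to compare, here is a minimal in-memory sketch in Python. It is not Cassandra driver code: the class names, the `record`/`read` methods and the threshold `Y = 3` are all hypothetical, and dictionaries stand in for the tables. It only models what each design has to do on a write and on a read.

```python
from collections import defaultdict

Y = 3  # hypothetical reset threshold


class CounterModel:
    """Models xcounter1: one counter cell per (uid, someid)."""

    def __init__(self, y):
        self.y = y
        self.cells = defaultdict(int)

    def record(self, uid, someid, time):
        key = (uid, someid)
        self.cells[key] += 1          # blind increment, no read needed
        if self.cells[key] >= self.y:
            self.cells[key] = 0       # reset once Y occurrences are reached
            return True
        return False

    def read(self, uid, someid):
        return self.cells.get((uid, someid), 0)   # single cell


class EventLogModel:
    """Models xcounter2: one row per event, counted at read time."""

    def __init__(self, y):
        self.y = y
        self.rows = defaultdict(list)

    def record(self, uid, someid, time):
        rows = self.rows[(uid, someid)]
        rows.append(time)             # blind insert
        if len(rows) >= self.y:
            rows.clear()              # delete all rows for the pair
            return True
        return False

    def read(self, uid, someid):
        return len(self.rows.get((uid, someid), []))   # scans up to Y rows


class RunningCountModel:
    """Models xcounter3: each row carries the running Ycount."""

    def __init__(self, y):
        self.y = y
        self.rows = defaultdict(list)

    def record(self, uid, someid, time):
        rows = self.rows[(uid, someid)]
        # The writer must know the previous Ycount: a read before each write.
        ycount = (rows[-1][1] if rows else 0) + 1
        rows.append((time, ycount))
        if ycount >= self.y:
            rows.clear()
            return True
        return False

    def read(self, uid, someid):
        rows = self.rows.get((uid, someid), [])
        return rows[-1][1] if rows else 0   # only the newest row


# Feed the same four events to all three models and compare the counts.
models = [CounterModel(Y), EventLogModel(Y), RunningCountModel(Y)]
for t in range(4):
    for m in models:
        m.record("u1", 7, t)
counts = [m.read("u1", 7) for m in models]
```

In this model the three designs agree on the count and differ only in cost: xcounter2 reads up to Y rows per lookup, while xcounter3 reads one row but needs the previous Ycount before every insert.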
Re: Is there update-in-place on maps?
Counters can be atomically incremented (http://wiki.apache.org/cassandra/Counters). Pick a UUID for the counter, and use that:

c=map.get(k); c.incr()

On 6 August 2013 11:01, Jan Algermissen wrote:
>
> On 06.08.2013, at 11:36, Andy Twigg wrote:
>
> > Store pointers to counters as map values?
>
> Sorry, but this fits into nothing I know about C* so far - can you explain?
>
> Jan
>

--
Dr Andy Twigg
Junior Research Fellow, St Johns College, Oxford
Room 351, Department of Computer Science
http://www.cs.ox.ac.uk/people/andy.twigg/
andy.tw...@cs.ox.ac.uk | +447799647538
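The one-line pseudocode above can be spelled out as a small in-memory sketch in Python. The names `row_map`, `counters` and `incr` are hypothetical, and plain dictionaries stand in for a map column and a separate counter column family; no Cassandra API is used here.

```python
import uuid
from collections import defaultdict

# Stand-in for a counter column family, keyed by counter UUID.
counters = defaultdict(int)

# Stand-in for a map column: map key -> UUID of the counter it points to.
row_map = {}


def incr(k, delta=1):
    """Increment the counter that map key k points to, creating it if needed."""
    counter_id = row_map.get(k)
    if counter_id is None:
        counter_id = uuid.uuid4()     # pick a UUID for the new counter
        row_map[k] = counter_id       # the map stores only the pointer
    counters[counter_id] += delta     # the update happens on the counter
    return counters[counter_id]


incr("clicks")
clicks = incr("clicks")
```

The map value is written once, when the pointer is created; every later update touches only the counter, which is the point of the indirection.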
Re: Is there update-in-place on maps?
Store pointers to counters as map values?
Re: random thoughts for MUCH faster key lookup in cassandra
How would you implement range queries?

On 29 May 2013 17:49, Hiller, Dean wrote:
> We recently ran into too much data in one CF because LCS can't really run
> in parallel on one CF in a single tier, which got me thinking: why doesn't
> the CF directory have 100 or 1000 directories 0-999, and Cassandra hash the
> key to which directory it would go in and then put it in one of the
> sstables in that directory? This would lead to
>
> 1. Parallel compaction of LCS in a single CF. Yeah, faster
> compactions since there is less to sort in each directory (and it can be
> done in parallel too)
> 2. Help with fast key lookups, as it hashes to one of the 1000
> directories very quickly and then just needs to find the key in one of the
> sstables, which are sorted (there would be 1000x fewer sstables in each
> directory than in one big CF)
>
> Am I on crack here? Or does that seem like it would be a pretty good
> direction to go?
>
> Maybe this is only because our system has 98% of its data in one CF while
> other systems have 10% of their data in each CF, though. I still tend to
> think a lot of people will end up with 80% of their data in one CF and 20%
> in all the other CFs… isn't Pareto's principle a natural tendency, and if it
> is, maybe the above feature should be considered?
>
> Later,
> Dean
>

--
Dr Andy Twigg
Junior Research Fellow, St Johns College, Oxford
Room 351, Department of Computer Science
http://www.cs.ox.ac.uk/people/andy.twigg/
andy.tw...@cs.ox.ac.uk | +447799647538
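One way to read the question is through a toy model of the proposal. The Python sketch below is an in-memory illustration under stated assumptions, not Cassandra's storage engine: `NUM_DIRS`, `directory_for` and `BucketedStore` are hypothetical names, MD5 is an arbitrary choice of hash, and a sorted list stands in for the sstables of each directory.

```python
import hashlib
from bisect import bisect_left, bisect_right

NUM_DIRS = 1000  # the "0-999" directories from the proposal


def directory_for(key: str) -> int:
    """Hash a key to one of NUM_DIRS directories (hash choice is an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DIRS


class BucketedStore:
    def __init__(self):
        # One sorted list of keys per directory.
        self.dirs = [[] for _ in range(NUM_DIRS)]

    def put(self, key: str) -> None:
        bucket = self.dirs[directory_for(key)]
        i = bisect_left(bucket, key)
        if i == len(bucket) or bucket[i] != key:
            bucket.insert(i, key)     # keep each directory sorted

    def contains(self, key: str) -> bool:
        # Point lookup: hash to one directory, binary-search only there.
        bucket = self.dirs[directory_for(key)]
        i = bisect_left(bucket, key)
        return i < len(bucket) and bucket[i] == key

    def range(self, lo: str, hi: str) -> list:
        # Range scan: hashing scatters neighbouring keys, so every
        # directory has to be searched and the results merged.
        out = []
        for bucket in self.dirs:
            out.extend(bucket[bisect_left(bucket, lo):bisect_right(bucket, hi)])
        return sorted(out)


store = BucketedStore()
for k in ["apple", "banana", "cherry", "date"]:
    store.put(k)
found = store.range("b", "d")
```

In this model a point lookup touches one directory, while a range scan touches all 1000, which is the cost the question points at.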
Re: subscribe request
i was hoping for a rick roll.

On 14 February 2013 16:55, Eric Evans wrote:
> This is new.
>
> On Thu, Feb 14, 2013 at 9:24 AM, Muntasir Raihan Rahman wrote:
>>
>>
>> --
>> Best Regards
>> Muntasir Raihan Rahman
>> Email: muntasir.rai...@gmail.com
>> Phone: 1-217-979-9307
>> Department of Computer Science,
>> University of Illinois Urbana Champaign,
>> 3111 Siebel Center,
>> 201 N. Goodwin Avenue,
>> Urbana, IL 61801
>
>
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
Fwd: British Conference on Databases 2013 - Big Data special.
Dear all,

[Apologies if you receive this CFP multiple times or are uninterested]

I am organizing the British Conference on Databases (BNCOD) this year and we would very much like to see some industrial contributions around Big Data. How have you used Hadoop, HBase, Cassandra, or machine learning techniques in a way that others might want to know about? All contributions are peer-reviewed so the quality should be somewhat high. Get in touch with me if you have any questions.

29th British National Conference on Databases
University of Oxford, United Kingdom
8-10 July 2013
http://www.cs.ox.ac.uk/bncod2013/

CALL FOR PAPERS

Abstract deadline: 31 January 2013
Paper deadline: 7 February 2013

BNCOD 2013 seeks research papers for presentation at the conference and subsequent publication. It welcomes research papers on a broad range of topics related to data-centric computation. For some years, every edition of BNCOD has centred around a main theme, acting as a focal point for keynote addresses, tutorials, and research papers. The theme of BNCOD 2013 will be Big Data; it encompasses a growing need to manage data that is too big, too fast, or too hard for the existing technology (Sam Madden: From Databases to Big Data. IEEE Internet Computing 16(3): 4-6 (2012)).

BNCOD promises a very exciting programme featuring keynotes and tutorials by distinguished researchers. Christoph Koch will speak on Compilation and Synthesis in Big Data Analytics and Dan Suciu will speak on Big Data Begets Big Database Theory. There will be a further keynote by Peter Buneman. There will also be tutorials on Querying Big Social Data by Wenfei Fan and on Big Data Analytics by Chris Re.

TOPICS OF INTEREST

The topics listed below are intended as a sample; we encourage submissions on all data-centric topics.

Systems for Data Management: data system architecture; storage, replication and consistency; physical representations; query and dataflow processing

Scalable Data Analysis: complex queries and search; approximate querying; scalable statistical methods; management of uncertainty and reasoning at scale; data privacy and security; data mining and knowledge discovery

Management of Very Large Data Systems: availability; adaptivity and self-tuning; power management; virtualization

Data Models and Languages: XML and semi-structured data; multi-media, temporal and spatial data; data streams; declarative languages; language interfaces for databases

Domain-Specific Data Management: methods and systems for science; networks and mobility; ubiquitous computing; sensor databases

Management of Web and Heterogeneous Data: information extraction; information integration; meta-data management; data cleaning; service oriented architectures

User Interfaces and Social Data: data visualization; collaborative data analysis and curation; social networks; email and messaging analytics

Data and Knowledge: knowledge base management; reasoning over incomplete and/or inconsistent data; ontology-based data access; ontology querying; semantic query optimization; storing and manipulating RDF data

SUBMISSION GUIDELINES

The conference management tool for the submission of abstracts and papers is accessible at
http://www.easychair.org/conferences/?conf=bncod2013

Full papers (from 12 to 14 pages), short papers (from 4 to 10 pages), system descriptions and demonstrations (from 4 to 10 pages) may be submitted. As in previous years, papers will be published by Springer as a volume in the Lecture Notes in Computer Science (LNCS) series. Accepted papers will only be published if they are presented in person by a registered author at the conference.

Submissions are reviewed in a single-blind manner. They must be in PDF and formatted according to the Springer guidelines for the LNCS series:
http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0

--
Dr Andy Twigg
Junior Research Fellow, St Johns College, Oxford
Room 351, Department of Computer Science
http://www.cs.ox.ac.uk/people/andy.twigg/
andy.tw...@cs.ox.ac.uk | +447799647538