I assume that using the key i can get the all the columns like an array. Now i'd be using php to extract arraykey=>value in that array, just want to avoid that i.e i can directly print the column names. If you guys think its not a good idea I can drop it, anyways m new to it and a lot of things are coming to mind. As far as cassandra and columnfamily/ super columns are concerned i am pretty clear.
On Mon, Apr 12, 2010 at 12:23 AM, Benjamin Black <b...@b3k.us> wrote: > I have no idea what problem you are trying to solve. You are > misunderstanding a number of things about the Cassandra data model and > about how we are explaining it is best used. > > On Sun, Apr 11, 2010 at 11:37 AM, vineet daniel <vineetdan...@gmail.com> > wrote: > > Well my initial idea is to use value as column name, keeping key as an > > incremental integer. The discussion after each mail has drifted from this > > point which I had made. Will put it again. > > > > we want to store user information. We keep 1,2,3,4.....so on as keys. AND > > values as column names i.e rather than using column name 'first name', > i'd > > be using 'vineet' as column name, rather than using 'last name' as column > > name i'd be using 'daniel'. This way I can directly read column names as > > values. This is just a thought that has come to my mind while trying to > > design my db for cassandra. > > > > > > > > On Sun, Apr 11, 2010 at 11:46 PM, Benjamin Black <b...@b3k.us> wrote: > >> > >> Row keys must be unique. If your usernames are not unique and you > >> want to be able to query on them, you either need to figure out a way > >> to make them unique or treat the username rows themselves as indices, > >> which refer to a set of actually unique identifiers for users. > >> > >> On Sun, Apr 11, 2010 at 11:12 AM, vineet daniel <vineetdan...@gmail.com > > > >> wrote: > >> > its not a problem its a scenario, which we need to handle. And all I > am > >> > trying to do is to achieve what is not there with API i.e a workaroud. > >> > > >> > On Sun, Apr 11, 2010 at 11:06 PM, Benjamin Black <b...@b3k.us> wrote: > >> >> > >> >> A system that permits multiple people to have the same username has a > >> >> serious problem. > >> >> > >> >> On Sun, Apr 11, 2010 at 6:12 AM, vineet daniel < > vineetdan...@gmail.com> > >> >> wrote: > >> >> > How to handle same usernames. Otherwise seems fine to me. > >> >> > > >> >> > On Sun, Apr 11, 2010 at 6:17 PM, Dop Sun <su...@dopsun.com> wrote: > >> >> >> > >> >> >> Hi, > >> >> >> > >> >> >> > >> >> >> > >> >> >> As far as I can see it, the Cassandra API currently supports > >> >> >> criterias > >> >> >> on: > >> >> >> > >> >> >> Token – Key – Super Column Name (if applicable) - Column Names > >> >> >> > >> >> >> > >> >> >> > >> >> >> I guess Token is not usually used for the day to day queries, so, > >> >> >> Key > >> >> >> and > >> >> >> Column Names are normally used for querying. For the user name and > >> >> >> password > >> >> >> case, I guess it can be done like this: > >> >> >> > >> >> >> > >> >> >> > >> >> >> Define a CF as UserAuth with type as Super, and Key is user name, > >> >> >> while > >> >> >> password can be the SuperKeyName. So, while you receive the user > >> >> >> name > >> >> >> and > >> >> >> password from the UI (or any other methods), it can be queried > via: > >> >> >> multiget_slice or get_range_slices, if there are anything > returned, > >> >> >> means > >> >> >> that the user name and password matches. > >> >> >> > >> >> >> > >> >> >> > >> >> >> If not using the super column name, and put the password as the > >> >> >> column > >> >> >> name, the column name usually not used for these kind of > >> >> >> discretionary > >> >> >> values (actually, I don’t see any definitive documents on how to > use > >> >> >> the > >> >> >> column Names and Super Columns, flexibility is the good of > >> >> >> Cassandra, > >> >> >> or is > >> >> >> it bad if abused? :P) > >> >> >> > >> >> >> > >> >> >> > >> >> >> Not sure whether this is the best way, but I guess it will work. > >> >> >> > >> >> >> > >> >> >> > >> >> >> Regards, > >> >> >> > >> >> >> Dop > >> >> >> > >> >> >> > >> >> >> > >> >> >> From: Lucifer Dignified [mailto:vineetdan...@gmail.com] > >> >> >> Sent: Sunday, April 11, 2010 5:33 PM > >> >> >> To: user@cassandra.apache.org > >> >> >> Subject: Re: How to perform queries on Cassandra? > >> >> >> > >> >> >> > >> >> >> > >> >> >> Hi Benjamin > >> >> >> > >> >> >> I'll try to make it more clear to you. > >> >> >> We have a user table with fields 'id', 'username', and 'password'. > >> >> >> Now > >> >> >> if > >> >> >> use the ideal way to store key/value, like : > >> >> >> username : vineetdaniel > >> >> >> timestamp > >> >> >> password : <password> > >> >> >> timestamp > >> >> >> > >> >> >> second user : > >> >> >> > >> >> >> username: <seconduser> > >> >> >> timestamp > >> >> >> password:<password> > >> >> >> > >> >> >> and so on, here what i assume is that as we cannot make search on > >> >> >> values > >> >> >> (as confirmed by guys on cassandra forums) we are not able to > >> >> >> perform > >> >> >> robust > >> >> >> 'where' queries. Now what i propose is this. > >> >> >> > >> >> >> Rather than using a static values for column names use values > itself > >> >> >> and > >> >> >> unique key as identifier. So, the above example when put in as per > >> >> >> me > >> >> >> would > >> >> >> be. > >> >> >> > >> >> >> vineetdaniel : vineetdaniel > >> >> >> timestamp > >> >> >> > >> >> >> <password>:<password> > >> >> >> timestamp > >> >> >> > >> >> >> second user > >> >> >> seconduser:seconduser > >> >> >> timestamp > >> >> >> > >> >> >> password:password > >> >> >> timestamp > >> >> >> > >> >> >> By using above methodology we can simply make search on keys > itself > >> >> >> rather > >> >> >> than going into using different CF's. But to add further, this > >> >> >> cannot > >> >> >> be > >> >> >> used for every situation. I am still exploring this, and soon will > >> >> >> be > >> >> >> updating the group and my blog with information pertaining to > this. > >> >> >> As > >> >> >> cassandra is new, I think every idea or experience should be > shared > >> >> >> with the > >> >> >> community. > >> >> >> > >> >> >> I hope I example is clear this time. Should you have any queries > >> >> >> feel > >> >> >> free > >> >> >> to revert. > >> >> >> > >> >> >> On Sun, Apr 11, 2010 at 2:01 PM, Benjamin Black <b...@b3k.us> wrote: > >> >> >> > >> >> >> Sorry, I don't understand your example. > >> >> >> > >> >> >> On Sun, Apr 11, 2010 at 12:54 AM, Lucifer Dignified > >> >> >> <vineetdan...@gmail.com> wrote: > >> >> >> > Benjamin I quite agree to you, but what in case of duplicate > >> >> >> > usernames, > >> >> >> > suppose if I am not using unique names as in email id's . If we > >> >> >> > have > >> >> >> > duplicacy in usernames we cannot use it for key, so what should > be > >> >> >> > the > >> >> >> > solution. I think keeping incremental numeric id as key and > >> >> >> > keeping > >> >> >> > the > >> >> >> > name > >> >> >> > and value same in the column family. > >> >> >> > > >> >> >> > Example : > >> >> >> > User1 has password as 123456 > >> >> >> > > >> >> >> > Cassandra structure : > >> >> >> > > >> >> >> > 1 as key > >> >> >> > user1 - column name > >> >> >> > value - user1 > >> >> >> > 123456 - column name > >> >> >> > value - 123456 > >> >> >> > > >> >> >> > I m thinking of doing it this way for my applicaton, this way i > >> >> >> > can > >> >> >> > run > >> >> >> > different sorts of queries too. Any feedback on this is welcome. > >> >> >> > > >> >> >> > On Sun, Apr 11, 2010 at 1:13 PM, Benjamin Black <b...@b3k.us> > wrote: > >> >> >> >> > >> >> >> >> You would have a Column Family, not a column for that; let's > call > >> >> >> >> it > >> >> >> >> the Users CF. You'd use username as the row key and have a > >> >> >> >> column > >> >> >> >> called 'password'. For your example query, you'd retrieve row > >> >> >> >> key > >> >> >> >> 'usr2', column 'password'. The general pattern is that you > >> >> >> >> create > >> >> >> >> CFs > >> >> >> >> to act as indices for each query you want to perform. There is > >> >> >> >> no > >> >> >> >> equivalent to a relational store to perform arbitrary queries. > >> >> >> >> You > >> >> >> >> must structure things to permit the queries of interest. > >> >> >> >> > >> >> >> >> > >> >> >> >> b > >> >> >> >> > >> >> >> >> On Sat, Apr 10, 2010 at 8:34 PM, dir dir < > sikerasa...@gmail.com> > >> >> >> >> wrote: > >> >> >> >> > I have already read the API spesification. Honestly I do not > >> >> >> >> > understand > >> >> >> >> > how to use it. Because there are not an examples. > >> >> >> >> > > >> >> >> >> > For example I have a column like this: > >> >> >> >> > > >> >> >> >> > UserName Password > >> >> >> >> > usr1 abc > >> >> >> >> > usr2 xyz > >> >> >> >> > usr3 opm > >> >> >> >> > > >> >> >> >> > suppose I want query the user's password using SQL in RDBMS > >> >> >> >> > > >> >> >> >> > Select Password From Users Where UserName = "usr2"; > >> >> >> >> > > >> >> >> >> > Now I want to get the password using OODBMS DB4o Object Query > >> >> >> >> > and > >> >> >> >> > Java > >> >> >> >> > > >> >> >> >> > ObjectSet QueryResult = db.query(new Predicate() > >> >> >> >> > { > >> >> >> >> > public boolean match(Users Myusers) > >> >> >> >> > { > >> >> >> >> > return Myuser.getUserName() == "usr2"; > >> >> >> >> > } > >> >> >> >> > }); > >> >> >> >> > > >> >> >> >> > After we get the Users instance in the QueryResult, hence we > >> >> >> >> > can > >> >> >> >> > get > >> >> >> >> > the > >> >> >> >> > usr2's password. > >> >> >> >> > > >> >> >> >> > How we perform this query using Cassandra API and Java?? > >> >> >> >> > Would you tell me please?? Thank You. > >> >> >> >> > > >> >> >> >> > Dir. > >> >> >> >> > > >> >> >> >> > > >> >> >> >> > On Sat, Apr 10, 2010 at 11:06 AM, Paul Prescod > >> >> >> >> > <p...@prescod.net> > >> >> >> >> > wrote: > >> >> >> >> >> > >> >> >> >> >> No. Cassandra has an API. > >> >> >> >> >> > >> >> >> >> >> http://wiki.apache.org/cassandra/API > >> >> >> >> >> > >> >> >> >> >> On Fri, Apr 9, 2010 at 8:00 PM, dir dir > >> >> >> >> >> <sikerasa...@gmail.com> > >> >> >> >> >> wrote: > >> >> >> >> >> > Does Cassandra has a default query language such as SQL in > >> >> >> >> >> > RDBMS > >> >> >> >> >> > and Object Query in OODBMS? Thank you. > >> >> >> >> >> > > >> >> >> >> >> > Dir. > >> >> >> >> >> > > >> >> >> >> >> > On Sat, Apr 10, 2010 at 7:01 AM, malsmith > >> >> >> >> >> > <malsm...@treehousesystems.com> > >> >> >> >> >> > wrote: > >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> It's sort of an interesting problem - in RDBMS one > >> >> >> >> >> >> relatively > >> >> >> >> >> >> simple > >> >> >> >> >> >> approach would be calculate a rectangle that is X km by Y > >> >> >> >> >> >> km > >> >> >> >> >> >> with > >> >> >> >> >> >> User > >> >> >> >> >> >> 1's > >> >> >> >> >> >> location at the center. So the rectangle is UserX - > 10KmX > >> >> >> >> >> >> , > >> >> >> >> >> >> UserY-10KmY to > >> >> >> >> >> >> UserX+10KmX , UserY+10KmY > >> >> >> >> >> >> > >> >> >> >> >> >> Then you could query the database for all other users > where > >> >> >> >> >> >> that > >> >> >> >> >> >> each > >> >> >> >> >> >> user > >> >> >> >> >> >> considered is curUserX > UserX-10Km and curUserX < > >> >> >> >> >> >> UserX+10KmX > >> >> >> >> >> >> and > >> >> >> >> >> >> curUserY > >> >> >> >> >> >> > UserY-10KmY and curUserY < UserY+10KmY > >> >> >> >> >> >> * Not the 10KmX and 10KmY are really a translation from > >> >> >> >> >> >> Kilometers > >> >> >> >> >> >> to > >> >> >> >> >> >> degrees of lat and longitude (that you can find on a > >> >> >> >> >> >> google > >> >> >> >> >> >> search) > >> >> >> >> >> >> > >> >> >> >> >> >> With the right indexes this query actually runs pretty > >> >> >> >> >> >> well. > >> >> >> >> >> >> > >> >> >> >> >> >> Translating that to Cassandra seems a bit complex at > first > >> >> >> >> >> >> - > >> >> >> >> >> >> but > >> >> >> >> >> >> you > >> >> >> >> >> >> could > >> >> >> >> >> >> try something like pre-calculating a grid with the right > >> >> >> >> >> >> resolution > >> >> >> >> >> >> (like a > >> >> >> >> >> >> square of 5KM per side) and assign every user to a > >> >> >> >> >> >> particular > >> >> >> >> >> >> grid > >> >> >> >> >> >> ID. > >> >> >> >> >> >> That > >> >> >> >> >> >> way you just calculate with grid ID User1 is in then do a > >> >> >> >> >> >> direct > >> >> >> >> >> >> key > >> >> >> >> >> >> lookup > >> >> >> >> >> >> to get a list of the users in that same grid id. > >> >> >> >> >> >> > >> >> >> >> >> >> A second approach would be to have to column families -- > >> >> >> >> >> >> one > >> >> >> >> >> >> that > >> >> >> >> >> >> maps > >> >> >> >> >> >> a > >> >> >> >> >> >> Latitude to a list of users who are at that latitude and > a > >> >> >> >> >> >> second > >> >> >> >> >> >> that > >> >> >> >> >> >> maps > >> >> >> >> >> >> users who are at a particular longitude. You could do > the > >> >> >> >> >> >> same > >> >> >> >> >> >> rectange > >> >> >> >> >> >> calculation above then do a get_slice range lookup to get > a > >> >> >> >> >> >> list > >> >> >> >> >> >> of > >> >> >> >> >> >> users > >> >> >> >> >> >> from range of latitude and a second list from the range > of > >> >> >> >> >> >> longitudes. > >> >> >> >> >> >> You would then need to do a in-memory nested loop to find > >> >> >> >> >> >> the > >> >> >> >> >> >> list > >> >> >> >> >> >> of > >> >> >> >> >> >> users > >> >> >> >> >> >> that are in both lists. This second approach could cause > >> >> >> >> >> >> some > >> >> >> >> >> >> trouble > >> >> >> >> >> >> depending on where you search and how many users you > really > >> >> >> >> >> >> have > >> >> >> >> >> >> -- > >> >> >> >> >> >> some > >> >> >> >> >> >> latitudes and longitudes have many many people in them > >> >> >> >> >> >> > >> >> >> >> >> >> So, it seems some version of a chunking / grid id thing > >> >> >> >> >> >> would > >> >> >> >> >> >> be > >> >> >> >> >> >> the > >> >> >> >> >> >> better approach. If you let people zoom in or zoom out > - > >> >> >> >> >> >> you > >> >> >> >> >> >> could > >> >> >> >> >> >> just > >> >> >> >> >> >> have different column families for each level of zoom. > >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> I'm stuck on a stopped train so -- here is even more > code: > >> >> >> >> >> >> > >> >> >> >> >> >> static Decimal GetLatitudeMiles(Decimal lat) > >> >> >> >> >> >> { > >> >> >> >> >> >> Decimal f = 0.0M; > >> >> >> >> >> >> lat = Math.Abs(lat); > >> >> >> >> >> >> f = 68.99M; > >> >> >> >> >> >> if (lat >= 0.0M && lat < 10.0M) { f = 68.71M; } > >> >> >> >> >> >> else if (lat >= 10.0M && lat < 20.0M) { f = 68.73M; } > >> >> >> >> >> >> else if (lat >= 20.0M && lat < 30.0M) { f = 68.79M; } > >> >> >> >> >> >> else if (lat >= 30.0M && lat < 40.0M) { f = 68.88M; } > >> >> >> >> >> >> else if (lat >= 40.0M && lat < 50.0M) { f = 68.99M; } > >> >> >> >> >> >> else if (lat >= 50.0M && lat < 60.0M) { f = 69.12M; } > >> >> >> >> >> >> else if (lat >= 60.0M && lat < 70.0M) { f = 69.23M; } > >> >> >> >> >> >> else if (lat >= 70.0M && lat < 80.0M) { f = 69.32M; } > >> >> >> >> >> >> else if (lat >= 80.0M) { f = 69.38M; } > >> >> >> >> >> >> > >> >> >> >> >> >> return f; > >> >> >> >> >> >> } > >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> Decimal MilesPerDegreeLatitude = > >> >> >> >> >> >> GetLatitudeMiles(zList[0].Latitude); > >> >> >> >> >> >> Decimal MilesPerDegreeLongitude = ((Decimal) > >> >> >> >> >> >> Math.Abs(Math.Cos((Double) > >> >> >> >> >> >> zList[0].Latitude))) * 24900.0M / 360.0M; > >> >> >> >> >> >> dRadius = 10.0M // ten miles > >> >> >> >> >> >> Decimal deltaLat = dRadius / MilesPerDegreeLatitude; > >> >> >> >> >> >> Decimal deltaLong = dRadius / MilesPerDegreeLongitude; > >> >> >> >> >> >> > >> >> >> >> >> >> ps.TopLatitude = zList[0].Latitude - deltaLat; > >> >> >> >> >> >> ps.TopLongitude = zList[0].Longitude - deltaLong; > >> >> >> >> >> >> ps.BottomLatitude = zList[0].Latitude + deltaLat; > >> >> >> >> >> >> ps.BottomLongitude = zList[0].Longitude + deltaLong; > >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> On Fri, 2010-04-09 at 16:30 -0700, Paul Prescod wrote: > >> >> >> >> >> >> > >> >> >> >> >> >> 2010/4/9 Onur AKTAS <onur.ak...@live.com>: > >> >> >> >> >> >> > ... > >> >> >> >> >> >> > I'm trying to find out how do you perform queries with > >> >> >> >> >> >> > calculations > >> >> >> >> >> >> > on > >> >> >> >> >> >> > the > >> >> >> >> >> >> > fly without inserting the data as calculated from the > >> >> >> >> >> >> > beginning. > >> >> >> >> >> >> > Lets say we have latitude and longitude coordinates of > >> >> >> >> >> >> > all > >> >> >> >> >> >> > users > >> >> >> >> >> >> > and > >> >> >> >> >> >> > we > >> >> >> >> >> >> > have > >> >> >> >> >> >> > Distance(from_lat, from_long, to_lat, to_long) > function > >> >> >> >> >> >> > which > >> >> >> >> >> >> > gives distance between lat/longs pairs in kilometers. > >> >> >> >> >> >> > >> >> >> >> >> >> I'm not an expert, but I think that it boils down to > >> >> >> >> >> >> "MapReduce" > >> >> >> >> >> >> and > >> >> >> >> >> >> "Hadoop". > >> >> >> >> >> >> > >> >> >> >> >> >> I don't think that there's any top-down tutorial on those > >> >> >> >> >> >> two > >> >> >> >> >> >> words, > >> >> >> >> >> >> you'll have to research yourself starting here: > >> >> >> >> >> >> > >> >> >> >> >> >> * http://en.wikipedia.org/wiki/MapReduce > >> >> >> >> >> >> > >> >> >> >> >> >> * http://hadoop.apache.org/ > >> >> >> >> >> >> > >> >> >> >> >> >> * http://wiki.apache.org/cassandra/HadoopSupport > >> >> >> >> >> >> > >> >> >> >> >> >> I don't think it is all documented in any one place > yet... > >> >> >> >> >> >> > >> >> >> >> >> >> Paul Prescod > >> >> >> >> >> >> > >> >> >> >> >> > > >> >> >> >> >> > > >> >> >> >> > > >> >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > >> >> >> > >> >> > > >> > > >> > > > > > >