On 8/7/10 4:22 AM, Thomas Heller wrote:
Ok, I think the part I was missing was the concatenation of the key and
partition to do the lookups. Is this the preferred way of accomplishing
needs such as this? Are there alternative ways?
Depending on your needs you can concat the row key or use super columns.

How would one then "query" over multiple days? Same question for all days.
Should I use range_slice or multiget_slice? And if it's range_slice, does that
mean I need OrderPreservingPartitioner?
The last 3 days is pretty simple: ['2010-08-07', '2010-08-06',
'2010-08-05'], as is 7, 31, etc. Just generate the keys in your app
and use multiget_slice.
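
A minimal sketch of generating those keys in the app, assuming the row keys are plain ISO dates such as '2010-08-07'. The helper name last_n_days is made up for illustration; the resulting list is what you would hand to multiget_slice through whatever client you use.

```python
from datetime import date, timedelta

def last_n_days(today, n):
    """Return n row keys, newest first, formatted as YYYY-MM-DD."""
    return [(today - timedelta(days=i)).isoformat() for i in range(n)]

# Keys for the last 3 days, counting back from 2010-08-07.
keys = last_n_days(date(2010, 8, 7), 3)
print(keys)
```

The same helper covers 7 or 31 days by changing n.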

If you want to get all days where a specific ip address had some
requests you'll just need another CF where the row key is the addr and
column names are the days (values optional again). Pretty much the
same all over again, just add another CF and insert the data you need.
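
To illustrate the "write it twice" idea, here is a rough in-memory sketch using plain Python dicts in place of the two CFs. It is not a Cassandra client; the names requests_by_day, days_by_ip and record_request are hypothetical, and the layout of the first CF (day as row key, address as column name) is an assumption made only for the example.

```python
from collections import defaultdict

# Plain dicts standing in for two column families (assumed layout).
requests_by_day = defaultdict(dict)  # row key: day, column name: ip
days_by_ip = defaultdict(dict)       # row key: ip,  column name: day

def record_request(day, ip):
    """Write the same fact into both structures; values stay empty."""
    requests_by_day[day][ip] = ""
    days_by_ip[ip][day] = ""

record_request("2010-08-06", "127.0.0.1")
record_request("2010-08-07", "127.0.0.1")
record_request("2010-08-07", "10.0.0.5")

# All days on which one address made requests: a single row read.
print(sorted(days_by_ip["127.0.0.1"]))
```

The point is that each query pattern gets its own row layout, filled at write time, instead of one structure that has to answer every question.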

get_range_slice in my experience is better used for "offline" tasks
where you really want to process every row there is.

/thomas
Ok... as an example, would this work for looking up logs by IP for a certain timeframe/range?

<ColumnFamily Name="SearchLog"/>

<ColumnFamily Name="IPSearchLog"
                           ColumnType="Super"
                           CompareWith="UTF8Type"
                           CompareSubcolumnsWith="TimeUUIDType"/>

Resulting in a structure like:

{
  "127.0.0.1" : {
    "2010080711" : {
      uuid1 : "",
      uuid2 : "",
      uuid3 : ""
    },
    "2010080712" : {
      uuid1 : "",
      uuid2 : "",
      uuid3 : ""
    }
  },
  "some.other.ip" : {
    "2010080711" : {
      uuid1 : ""
    }
  }
}

Here each uuid is the row key used in SearchLog. Is there anything wrong with this? I know there is a 2 billion column limit, but in this case that would never be exceeded because each super column represents an hour. However, does the above "schema" imply that for any given IP there can only be a maximum of 2GB of data stored?
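
For the timeframe lookup itself, the hour names could be generated in the app in the same way as the day keys. A rough sketch, assuming hourly buckets named YYYYMMDDHH as in the structure above; hour_buckets is a hypothetical helper, not part of any client library.

```python
from datetime import datetime, timedelta

def hour_buckets(start, end):
    """Return YYYYMMDDHH names for every hour from start through end, inclusive."""
    # Round the start down to the top of its hour so its bucket is included.
    current = start.replace(minute=0, second=0, microsecond=0)
    names = []
    while current <= end:
        names.append(current.strftime("%Y%m%d%H"))
        current += timedelta(hours=1)
    return names

print(hour_buckets(datetime(2010, 8, 7, 11, 30), datetime(2010, 8, 7, 13, 5)))
```

The returned names would then be the super columns to request from the row for a given IP.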
