On Wed, Sep 1, 2010 at 10:43 AM, Ingo Schmid <[email protected]> wrote:
> May I ask how many is a lot?
There are two scenarios -- in one, I already have the data, and it is a lot of data. While I don't know the exact numbers, my estimate is that they would weigh in somewhere around 800 GB. I would like to organize them as separate piddles (logically, each piddle would be one image) that would weigh in at around 90 MB apiece. Since, at any given time, I would be looking at only a small range in each piddle, it just seems a waste to be slogging through 90 MB of data. Not a problem in a one-off process, but in a web-based process with multiple users, that could easily become a hog.

In the second scenario, the piddle size is actually tiny, but piddles are created on an ad hoc basis, for different parts of the country. Once created, they are stored, so they are separate by nature of their creation, not lumped together.

So, either way, being able to identify the specific piddle required based on a meta-index would be a useful capability.

> I process imaging data, i.e. stacks of 3D images all concatenated into a
> 4D piddle. If you look for my previous message(s) on this list, there is
> a limit of 2 GB for a single piddle and some suggestions to patch Core.pm
> to push it further. If your memory is large enough to hold the piddle,
> stick to it and enjoy slicing, dicing and threading! If it's much bigger
> than memory, that's a different story.
>
> Ingo
>
> On 09/01/2010 05:27 PM, P Kishor wrote:
>>
>> (near future scenario) I have a lot of piddles covering contiguous
>> rectangular areas. I could stitch them together, but then I would have
>> one very large piddle. So, I leave them the way they are. The user
>> supplies a pair of coordinate pairs which lets me identify the piddle
>> we want, open it up, use range() to extract the area-of-interest (AOI),
>> and analyze it. I can do the identification of the piddle either based
>> on some naming scheme I can develop, or by storing some kind of
>> area->to->name index.
>> (Sidenote: Of course, I can do this identification task with
>> PostGIS/Postgres, or with SQLite+R*Tree, but I am hoping for an all-PDL
>> solution, or rather, a NoSQL solution.) So, that is the first problem...
>> a name->to->area index is easy, but an arbitrary_area->to->name index
>> is difficult.
>>
>> Second problem -- what if the arbitrary_area->to->name index returns
>> multiple piddles, as in, an AOI that overlaps several piddles? So,
>> first, the arbitrary_area->to->name index should be able to return
>> multiple piddles. Then, my program should be able to extract the
>> various smaller regions from the identified piddles and glue (or
>> append) them together into a piddle of the AOI, cache it temporarily,
>> and do analysis on it.
>>
>> I am thinking... this could be done with some kind of quad-tree
>> indexing scheme. Has this been done already? If not, suggestions on
>> how to proceed with this would be much welcome.

--
Puneet Kishor http://www.punkish.org
Carbon Model http://carbonmodel.org
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
Nelson Institute, UW-Madison http://www.nelson.wisc.edu
-----------------------------------------------------------------------
Assertions are politics; backing up assertions with evidence is science
=======================================================================
_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
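[Editor's note] The lookup-and-stitch pipeline described in the quoted message (an arbitrary_area->to->name index that can return multiple tiles, plus gluing the overlapping sub-regions into one AOI) can be sketched as follows. This is a minimal illustration in Python rather than PDL, with invented tile names, bounds, and data; in PDL the tiles would be piddles and the assembly would use range() and glue()/append().

```python
# Hypothetical sketch: tile names, bounds, and data are invented.

def overlaps(a, b):
    """True if two (xmin, ymin, xmax, ymax) rectangles intersect."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

# The easy direction: name -> area.
index = {
    "tile_00": (0, 0, 100, 100),
    "tile_01": (100, 0, 200, 100),
    "tile_10": (0, 100, 100, 200),
    "tile_11": (100, 100, 200, 200),
}

def tiles_for_aoi(aoi, index):
    """The hard direction, arbitrary_area -> name: every tile the AOI
    touches (possibly several), via a linear scan of the easy index."""
    return sorted(name for name, bounds in index.items()
                  if overlaps(aoi, bounds))

def extract_aoi(aoi, index, data):
    """Second problem: copy the overlapping part of each identified tile
    into one output grid -- a stand-in for gluing piddles together."""
    x0, y0, x1, y1 = aoi
    out = [[None] * (x1 - x0) for _ in range(y1 - y0)]
    for name in tiles_for_aoi(aoi, index):
        tx0, ty0, tx1, ty1 = index[name]
        grid = data[name]
        for y in range(max(y0, ty0), min(y1, ty1)):
            for x in range(max(x0, tx0), min(x1, tx1)):
                out[y - y0][x - x0] = grid[y - ty0][x - tx0]
    return out

# Each fake tile is filled with its own name, so the assembled AOI shows
# which tile every cell came from.
data = {name: [[name] * 100 for _ in range(100)] for name in index}
aoi = (98, 98, 102, 102)            # straddles all four tiles
print(tiles_for_aoi(aoi, index))    # ['tile_00', 'tile_01', 'tile_10', 'tile_11']
mosaic = extract_aoi(aoi, index, data)
print(mosaic[0][0], mosaic[3][3])   # tile_00 tile_11
```

The linear scan is O(number of tiles) per query; it is only meant to show the interface an index must provide (rectangle in, list of names out), not to compete with R*Tree or quad-tree lookup.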
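[Editor's note] The quad-tree idea raised at the end of the quoted message can also be sketched. Again a hypothetical illustration in Python rather than PDL (class name, depth limit, and tiles are all invented): each tile is stored at the deepest quadrant that fully contains it, and a query descends only into branches its AOI overlaps, which is what makes the arbitrary_area->to->name lookup cheap.

```python
# Minimal quad-tree sketch; everything here is hypothetical.

def overlaps(a, b):
    """True if two (xmin, ymin, xmax, ymax) rectangles intersect."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def contains(outer, inner):
    """True if rectangle `outer` fully contains rectangle `inner`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

class QuadTree:
    def __init__(self, bounds, depth=0, max_depth=4):
        self.bounds = bounds        # (xmin, ymin, xmax, ymax)
        self.depth = depth
        self.max_depth = max_depth
        self.items = []             # (name, rect) pairs stored at this node
        self.children = None

    def _subdivide(self):
        x0, y0, x1, y1 = self.bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        mk = lambda b: QuadTree(b, self.depth + 1, self.max_depth)
        self.children = [mk((x0, y0, xm, ym)), mk((xm, y0, x1, ym)),
                         mk((x0, ym, xm, y1)), mk((xm, ym, x1, y1))]

    def insert(self, name, rect):
        """Push the tile into the deepest quadrant that contains it whole;
        tiles spanning a subdivision line stop at the current node."""
        if self.children is None and self.depth < self.max_depth:
            self._subdivide()
        if self.children is not None:
            for child in self.children:
                if contains(child.bounds, rect):
                    child.insert(name, rect)
                    return
        self.items.append((name, rect))

    def query(self, aoi):
        """Names of all stored tiles the AOI overlaps, walking only the
        branches whose quadrant the AOI touches."""
        hits = [name for name, rect in self.items if overlaps(aoi, rect)]
        if self.children is not None:
            for child in self.children:
                if overlaps(aoi, child.bounds):
                    hits.extend(child.query(aoi))
        return sorted(hits)

qt = QuadTree((0, 0, 200, 200))
for name, rect in [("tile_00", (0, 0, 100, 100)),
                   ("tile_01", (100, 0, 200, 100)),
                   ("tile_10", (0, 100, 100, 200)),
                   ("tile_11", (100, 100, 200, 200))]:
    qt.insert(name, rect)

# An AOI straddling all four tiles returns all four names.
print(qt.query((90, 90, 110, 110)))
```

Whether this beats SQLite+R*Tree in practice would depend on tile counts and persistence needs; the sketch only shows that the index itself is small and needs nothing beyond the name -> bounds metadata, so it could live alongside the piddles without a database.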
