On Wed, Sep 1, 2010 at 10:43 AM, Ingo Schmid <[email protected]> wrote:
> May I ask how many is a lot?
There are two scenarios -- in one, I already have the data, and it is a lot of data. While I don't know the exact numbers, my estimate is that they would weigh in somewhere around 800 GB. I would like to organize them as separate piddles (logically, each piddle would be one image) that would weigh in at around 90 MB apiece. Since, at any given time, I would be looking at only a small range in each piddle, it just seems a waste to be slogging through 90 MB of data. Not a problem in a one-off process, but in a web-based process with multiple users, that could easily become a hog.

In the second scenario, the piddle size is actually tiny, but piddles are created on an ad hoc basis, for different parts of the country. Once created, they are stored, so they are separate by nature of their creation, not lumped together.

So, either way, being able to identify the specific piddle required based on a meta-index would be a useful capability.

> I process imaging data, i.e. stacks of 3D images all concatenated into a
> 4D piddle. If you look for my previous message(s) on this list, there is
> a limit of 2 GB for a single piddle and some suggestions to patch Core.pm
> to push it further. If your memory is large enough to hold the piddle,
> stick to it and enjoy slicing, dicing and threading! If it's much bigger
> than memory, that's a different story.
>
> Ingo
>
> On 09/01/2010 05:27 PM, P Kishor wrote:
>>
>> (near future scenario) I have a lot of piddles covering contiguous
>> rectangular areas. I could stitch them together, but then I would have
>> one very large piddle. So, I leave them the way they are. The user
>> supplies a pair of coordinate pairs which lets me identify the piddle
>> we want, open it up, use range() to extract the area-of-interest (AOI),
>> and analyze it. I can do the identification of the piddle either based
>> on some naming scheme I can develop, or by storing some kind of
>> area->to->name index.
>> (Sidenote: Of course, I can do this identification task with
>> PostGIS/Postgres, or with SQLite+R*Tree, but I am hoping for an all-PDL
>> solution, or rather, a NoSQL solution.) So, that is the first problem...
>> a name->to->area index is easy, but an arbitrary_area->to->name index
>> is difficult.
>>
>> Second problem -- what if the arbitrary_area->to->name index returns
>> multiple piddles, as in, an AOI that overlaps several piddles? So,
>> first, the arbitrary_area->to->name index should be able to return
>> multiple piddles. Then, my program should be able to extract the
>> various smaller regions from the identified piddles and glue (or
>> append) them together into a piddle of the AOI, cache it temporarily,
>> and do analysis on it.
>>
>> I am thinking... this could be done with some kind of quad-tree
>> indexing scheme. Has this been done already? If not, suggestions on
>> how to proceed with this would be much welcome.

--
Puneet Kishor http://www.punkish.org
Carbon Model http://carbonmodel.org
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
Nelson Institute, UW-Madison http://www.nelson.wisc.edu
-----------------------------------------------------------------------
Assertions are politics; backing up assertions with evidence is science
=======================================================================
_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
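[Editor's note] The lookup-and-stitch pipeline described in the quoted message (an arbitrary_area->to->name index that can return multiple tiles, plus gluing the overlapping sub-regions into one AOI) can be sketched as follows. This is a minimal illustration in Python rather than PDL, with invented tile names, bounds, and data; in PDL the tiles would be piddles and the assembly would use range() and glue()/append().

```python
# Hypothetical sketch: tile names, bounds, and data are invented.

def overlaps(a, b):
    """True if two (xmin, ymin, xmax, ymax) rectangles intersect."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

# The easy direction: name -> area.
index = {
    "tile_00": (0, 0, 100, 100),
    "tile_01": (100, 0, 200, 100),
    "tile_10": (0, 100, 100, 200),
    "tile_11": (100, 100, 200, 200),
}

def tiles_for_aoi(aoi, index):
    """The hard direction, arbitrary_area -> name: every tile the AOI
    touches (possibly several), via a linear scan of the easy index."""
    return sorted(name for name, bounds in index.items()
                  if overlaps(aoi, bounds))

def extract_aoi(aoi, index, data):
    """Second problem: copy the overlapping part of each identified tile
    into one output grid -- a stand-in for gluing piddles together."""
    x0, y0, x1, y1 = aoi
    out = [[None] * (x1 - x0) for _ in range(y1 - y0)]
    for name in tiles_for_aoi(aoi, index):
        tx0, ty0, tx1, ty1 = index[name]
        grid = data[name]
        for y in range(max(y0, ty0), min(y1, ty1)):
            for x in range(max(x0, tx0), min(x1, tx1)):
                out[y - y0][x - x0] = grid[y - ty0][x - tx0]
    return out

# Each fake tile is filled with its own name, so the assembled AOI shows
# which tile every cell came from.
data = {name: [[name] * 100 for _ in range(100)] for name in index}
aoi = (98, 98, 102, 102)            # straddles all four tiles
print(tiles_for_aoi(aoi, index))    # ['tile_00', 'tile_01', 'tile_10', 'tile_11']
mosaic = extract_aoi(aoi, index, data)
print(mosaic[0][0], mosaic[3][3])   # tile_00 tile_11
```

The linear scan is O(number of tiles) per query; it is only meant to show the interface an index must provide (rectangle in, list of names out), not to compete with R*Tree or quad-tree lookup.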
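[Editor's note] The quad-tree idea raised at the end of the quoted message can also be sketched. Again a hypothetical illustration in Python rather than PDL (class name, depth limit, and tiles are all invented): each tile is stored at the deepest quadrant that fully contains it, and a query descends only into branches its AOI overlaps, which is what makes the arbitrary_area->to->name lookup cheap.

```python
# Minimal quad-tree sketch; everything here is hypothetical.

def overlaps(a, b):
    """True if two (xmin, ymin, xmax, ymax) rectangles intersect."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def contains(outer, inner):
    """True if rectangle `outer` fully contains rectangle `inner`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

class QuadTree:
    def __init__(self, bounds, depth=0, max_depth=4):
        self.bounds = bounds        # (xmin, ymin, xmax, ymax)
        self.depth = depth
        self.max_depth = max_depth
        self.items = []             # (name, rect) pairs stored at this node
        self.children = None

    def _subdivide(self):
        x0, y0, x1, y1 = self.bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        mk = lambda b: QuadTree(b, self.depth + 1, self.max_depth)
        self.children = [mk((x0, y0, xm, ym)), mk((xm, y0, x1, ym)),
                         mk((x0, ym, xm, y1)), mk((xm, ym, x1, y1))]

    def insert(self, name, rect):
        """Push the tile into the deepest quadrant that contains it whole;
        tiles spanning a subdivision line stop at the current node."""
        if self.children is None and self.depth < self.max_depth:
            self._subdivide()
        if self.children is not None:
            for child in self.children:
                if contains(child.bounds, rect):
                    child.insert(name, rect)
                    return
        self.items.append((name, rect))

    def query(self, aoi):
        """Names of all stored tiles the AOI overlaps, walking only the
        branches whose quadrant the AOI touches."""
        hits = [name for name, rect in self.items if overlaps(aoi, rect)]
        if self.children is not None:
            for child in self.children:
                if overlaps(aoi, child.bounds):
                    hits.extend(child.query(aoi))
        return sorted(hits)

qt = QuadTree((0, 0, 200, 200))
for name, rect in [("tile_00", (0, 0, 100, 100)),
                   ("tile_01", (100, 0, 200, 100)),
                   ("tile_10", (0, 100, 100, 200)),
                   ("tile_11", (100, 100, 200, 200))]:
    qt.insert(name, rect)

# An AOI straddling all four tiles returns all four names.
print(qt.query((90, 90, 110, 110)))
```

Whether this beats SQLite+R*Tree in practice would depend on tile counts and persistence needs; the sketch only shows that the index itself is small and needs nothing beyond the name -> bounds metadata, so it could live alongside the piddles without a database.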
