Good idea CCing the list; I initially thought it was a topic of
limited general relevance, but I realized at least Mike might also
have something to say here.


On Sat, Sep 29, 2012 at 2:31 AM, Ralph Versteegen <teeem...@gmail.com> wrote:
> On 29 September 2012 00:38, David Gowers <00a...@gmail.com> wrote:
>
> I found that doing seeks on a file while writing to it is very slow on
> Windows XP; apparently the libc implementation flushes its cache when
> you seek. glibc doesn't do this. So I actually wrote a file write
... Wow, they like to punish databases? I'd guess that nobody hosts
websites on WinXP machines, but that'd be a wrong guess >_<.


> buffering layer used when writing RELOAD documents to avoid this.
> Loading RELOAD documents was also considerably slower in Win XP than
> in Linux; I don't know why, but it was still perfectly acceptable.

I'm currently researching this; as far as I know, Python does some
buffering on its own side of things, so how much of an impact this
has needs to be determined. My net access is kind of FUBAR, but I'll
get there eventually.

...

Looks like Python definitely does do its own buffering:

http://docs.python.org/library/io.html#io.BufferedIOBase
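For what it's worth, the buffer size is settable at open() time, so
that's an easy knob to experiment with (this part is stock stdlib;
whether seek() forces the buffer to flush is exactly the thing to
measure):

import io

# Ask for a 64 KiB buffer instead of the default
# io.DEFAULT_BUFFER_SIZE (8 KiB).
f = io.open('test.rld', 'w+b', buffering=65536)
f.write(b'\x00' * 100)
f.seek(0)   # does this flush the buffer? needs timing on XP
f.close()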

The only reference I've been able to find to slow seek() in Python is
this one guy who was perpetrating this inanity:

for i in range(1000000):
    f.seek(0)

/-_-\ .

Anyway, I'm curious, so I'll write a PlanarRecordManager subclass that
caches the entire content of the relevant super-record; then the two
can be swapped in and out to compare.
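Roughly this shape, probably (I'm making the method names up here --
read_super_record_bytes() and record_size are stand-ins; the real
PlanarRecordManager API will dictate the details):

class CachedPlanarRecordManager(PlanarRecordManager):
    # Sketch: read the whole super-record into memory once, then serve
    # individual record reads from the cache instead of seeking the file.
    def __init__(self, *args, **kwargs):
        PlanarRecordManager.__init__(self, *args, **kwargs)
        self._cache = None

    def read_record(self, index):
        if self._cache is None:
            # Hypothetical helper: one big sequential read of the super-record.
            self._cache = self.read_super_record_bytes()
        offset = index * self.record_size
        return self._cache[offset:offset + self.record_size]

Then timing a load with each manager should show how much the seeks
actually cost.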

>> (the function that 'with' blocks otherwise perform, but for whenever a
>> 'with' block is inappropriate). Much like you can do with file
>> handles.
>
> OK, but littering code with lots of close()s isn't too pretty.
In practice close() doesn't get much usage so far, versus 'with' blocks.
It won't go away (the context management used by 'with' calls it),
but there's no obligation to use it directly. I only use explicit
close()s where indentation levels would otherwise get excessive
(again, just as would be the case with filehandles).
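i.e. exactly the two styles you already know from files, which the
writer mirrors:

# 'with' style: close() is called automatically on leaving the block.
with open('example.txt', 'w') as f:
    f.write('hello\n')

# Explicit style, for when another nesting level would be excessive.
f = open('example.txt', 'a')
f.write('world\n')
f.close()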

> Fine, it is clearer, ever so slightly!
If you don't like it, don't use it; no problem.
I designed it that way because I much prefer it.

> Sketching DSLs can be a lot of fun (but they turn into frustration
> for me due to perfectionism; they usually can't be the silver bullet
> you want).
Yeah, even more so the more powerful the source language is. You end
up realizing that you're looking at implementing features that are
already done well in the host language, better than any quick hack
could manage.

>> You're right here. I actually went with the 'sub' idea initially
>> because it distinguished subsections clearly, but another convention
>> can probably handle that.
>
> That could be a useful convention, but there's no gain in restricting
> it to just RELOADWriter.
Hm. Point.

>
>>>
>>> The ZoneMap class should definitely have name and extra/extradata
>>> members rather than obscuring that in an 'info' tuple.
>>
>> So what, you want a name[N_ZONES] dict and an extradata[N_ZONES]
>> dict instead? Eh, okay.
>
> I didn't realise .info was indexed by zone id. That explains why you
> did it that way. In that light, I slightly prefer .info, as long as
> you make the .info elements namedtuples instead of plain tuples.
Sure, a lot of that stuff is provisional anyway; I just put things in
containers that seem reasonable, and wait for usage to tell me if
adjustments are needed.
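E.g. (field names invented for the example):

from collections import namedtuple

# Hypothetical fields -- whatever the per-zone info actually holds.
ZoneInfo = namedtuple('ZoneInfo', 'name extradata')

info = {}   # indexed by zone id, like ZoneMap.info
info[3] = ZoneInfo(name='poison swamp', extradata=None)
assert info[3].name == info[3][0]   # still acts as a tuple, so nothing breaks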

Updated status:

* 10 classes are done (most of them written from scratch this month)
* 3 are WIP / need more testing
* 20 are not started yet.

...(tile|foe|wall)maps got finished quickly.



I've been looking at how to do streaming RELOAD.
The general pattern I'm currently considering:

statsby = parentnode.when('stats_by')
if statsby:
    pass  # do stuff with statsby

and to explicitly get the next node, just iterate:

node = next(parentnode)


Related ideas:
* No need for another class; the difference in behaviour is minor.
* Precache node info on the current level. This allows processing
nodes 'out of order': if you request a node that comes after the
'current' node, you can still get it, and process the other one
whenever it comes up in your code.
* Precaching based on a size threshold: if the size specified on a
parent node is smaller than N bytes, precache all its children (see
the sketch at the end).
* Saving needs some thought; mainly, we need to be able to create
'detached' nodes and 'attach' them later when dealing with complex
structures.

In general this describes a 'mixed' model, where we stream when we
must and otherwise act like we have random access. I haven't decided
how good this model is yet. It would raise an error when sizes are
drastically out of whack (e.g. a container node you expect to be
1000 bytes is 100000 bytes instead).
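A rough sketch of the lookahead part (everything here is hypothetical;
_read_child() stands in for whatever the real node parsing ends up
being):

PRECACHE_THRESHOLD = 4096  # bytes; tune by experiment

class StreamingNode(object):
    def __init__(self, stream, size):
        self.stream = stream
        self.size = size
        self._lookahead = {}  # children read past while searching, by name
        if size < PRECACHE_THRESHOLD:
            # Small container: read all children now, so 'out-of-order'
            # access costs nothing.
            while self._read_ahead():
                pass

    def _read_child(self):
        # Placeholder: parse the next child node from self.stream,
        # returning None at the end of this container.
        raise NotImplementedError

    def _read_ahead(self):
        child = self._read_child()
        if child is not None:
            self._lookahead[child.name] = child
        return child

    def when(self, name):
        # Return the named child if it was precached or occurs at/after
        # the current position; otherwise None.
        while name not in self._lookahead:
            if self._read_ahead() is None:
                return None
        return self._lookahead.pop(name)

The size sanity check would go in _read_child(): if a child claims a
size wildly beyond its parent's, bail out with an error.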