Re: [Caml-list] Generating coinductive elements

2012-04-30 Thread Xavier Leroy
Hello Jean-Baptiste,

> Let us take an example using streams (infinite lists) of numbers:
> 
> type stream = Cons of int * stream

You won't go far with this type for integer streams, as the only way
to construct values of this type is "let rec", and it's severely
restricted (because of call-by-value requirements).  For instance, you
can't even define the stream of all integers 0.1.2.3.4...

The proper OCaml type for integer streams is

type stream = Cons of int * stream Lazy.t

and with this type you should be able to do pretty much everything you
want to do with your equations.

HTH,

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Re: [oss-security] CVE request: Hash DoS vulnerability (ocert-2011-003)

2012-03-14 Thread Xavier Leroy
Gerd Stolpmann writes:
> The Random module is definitely not good enough (e.g. if you
> know when the program was started like for a cgi, and the cgi reveals
> information it should better not like the pid, the Random seed is made
> from less than 10 unpredictable bits, and on some systems even 0 bits).

Dario Teixeira adds:
> I think the problem may be in finding a good source of randomness
> that is common across all OSes.  In Unixland this problem has
> largely been solved: pretty much everyone supports /dev/random and
> /dev/urandom.  Windows does things differently, however.

David Allsopp adds:
> Does the source of randomness have to be common? The decision to use
> a random seed doesn't need to be limited by a problem getting a good
> cryptographically secure generator on a given OS - you'd simply
> document that the implementation on that particular OS doesn't seed
> with a good PRNG and await a patch from someone who may care in the
> future, but at least the philosophy behind the decision is correct!

We are also thinking of strengthening Random.self_init, for instance
by using /dev/urandom when available.  This said, for randomizing
hashtables or other data structures, we do *not* need a
cryptographically-strong PRNG: we're not generating an RSA key pair or
some other situation where cryptographic quality is required; we're
just making a mild DOS attack impractical.

(Obligatory advertisement: if you're in need of
cryptographically-strong random data,
http://forge.ocamlcore.org/projects/cryptokit/
is what you need.)

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Association lists

2012-03-12 Thread Xavier Leroy
On 03/12/2012 07:12 PM, Lukasz Stafiniak wrote:

>> Resignedly awaiting a CVE about association lists,
> 
> Is using association lists a lot "poor style"? Wouldn't it be better
> to use maps -- which would make it possible to throw in different
> implementations to tune performance?

I was joking, but to answer seriously:

Association lists have O(1) insertion time but O(n) lookup time.  So,
you can use them as long as you're sure they are pretty short.  If
you're not sure, e.g. if malicious users of your program can grow the
a-list as much as they want, better use maps, indeed.

The joke was that we don't need a CVE to know this, just basic
algorithmic reasoning.

- Xavier Leroy




-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Re: [oss-security] CVE request: Hash DoS vulnerability (ocert-2011-003)

2012-03-12 Thread Xavier Leroy
On 03/10/2012 08:31 AM, Richard W.M. Jones wrote:

>> Rather than changing every app that uses Hashtbl, I'd prefer to fix
>> this upstream by choosing a random seed for hash tables unless the
>> caller explicitly sets one or sets an environment variable to disable
>> this.
>>
>> In Perl, the seed is a random number chosen when the Perl interpreter
>> starts up.  This is low overhead, but still leaves a (much more
>> theoretical) attack where someone can determine the seed from a
>> long-running process using some other method and still attack the hash
>> table.
>>
>> In Python there is an environment variable you can set to disable
>> randomized hash tables.  Further Python discussion here:
>> http://bugs.python.org/issue13703
>> http://mail.python.org/pipermail/python-dev/2012-January/thread.html#115465
> 
> No comment at all?  This is an exploitable CVE ...

As you and Gerd said, the new Hashtbl implementation in the upcoming
major release has everything needed to randomize hash tables by
seeding.  The question at this point is whether randomization should
be the default or not: some of our big users who don't do Web stuff
value reproducibility highly...  We (OCaml core developers) will take
a decision soon.

Musing: there is something strange about saying that a data structure
has a DOS vulnerability.  It's a bit like saying that a steak knife
has homicidal intent.  Web-facing applications that use the wrong data
structure have vulnerabilities; the data structure does not.  And,
even randomized, a hashtable still has O(n) worst-case behavior...

Gerd Stolpmann adds:

> Currently, the only way for library developers to fix their product for
> 3.12 is to restrict the size of the hashtables coming from untrusted
> sources.

A much better fix is to replace your hash tables with references to
AVL maps.  Guaranteed O(log n) is the way to go for Web app developers
to sleep soundly at night.

Resignedly awaiting a CVE about association lists,

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Custom let bindings

2012-01-22 Thread Xavier Leroy
On 01/20/2012 01:39 PM, Mehdi Dogguy wrote:

> I noticed that Alain Frisch tried to add custom let bindings (see r11894
> and r11906) but it was reverted later on (see r11960) because no
> consensus was reached (among OCaml Core team, I guess). 

Yes.  We Caml developers spent a lot of time arguing about the merits and
demerits of various syntaxes for monadic lets and other forms of
generalized let bindings, discussing usability, generality, etc.
There were some unexpected surprises, e.g. Lwt's concurrency monad
needing a special monadic "let ... and ... in ..."  binding, which
makes sense for this particular monad but not for others.

In the end, we decided that none of the proposals is something we can
commit on and put (forever) in the core language.  This is one of the
cases that is currently best handled by Camlp4 syntax extensions,
either specific to a monad (Lwt) or more general (pa_monad).

Regards,

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Will the data allocated by caml_alloc be automatically freed by g arbage collector?

2012-01-03 Thread Xavier Leroy
On 01/01/2012 05:01 PM, syshen wrote:

> So in this case I think the memory usage of my program should be the same
> like before calling minisat_save_proof , because the returned data should be
> collected by the garbage collector when exiting the method A to the main loop.
> But from the unix "top" command, I find that these huge return data seems to
> remain in memory and consume all my memory step by step.

There are several possible explanations:

1- Your program is actually keeping a reference to this big piece of
data, so it cannot be reclaimed.

2- You're falling into PR#5389, http://caml.inria.fr/mantis/view.php?id=5389
whereas OCaml's compactor can fail to return some memory to the malloc()
subsystem.

3- The malloc() implementation on your system elected not to / was
unable to return memory to the OS, hence the numbers reported by
"top" stay constant.

What happens if you run your test in an infinite loop?  Does memory
usage (as reported by "top") stays constant or increases as a function
of time?  Only the latter case is really problematic.

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Hashtbl and security

2012-01-01 Thread Xavier Leroy
On 01/01/2012 01:52 PM, Richard W.M. Jones wrote:
> On Fri, Dec 30, 2011 at 06:06:26PM +0100, Xavier Leroy wrote:
>> Indeed.  The optional "seed" parameter to Hashtbl.create does exactly
>> this in the new implementation of Hashtbl (the one based on Murmur3).
> 
> It may be worth noting that Perl solved this problem (back in 2003) by
> unconditionally using a seed which is a global set to a random number
> during interpreter initialization.  

That's how my initial reimplementation of Hashtbl worked, using the
Random module to produce seeds, but I was told (correctly) that in
security-sensitive applications it's better to leave the generation of
random numbers under control of the programmer.  For some applications
Random.self_init might be good enough and for others stronger
randomness is needed.

Of course, you can trivially emulate Perl's behavior using the new
Hashtbl implementation + the Random module.

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Hashtbl and security

2011-12-30 Thread Xavier Leroy
On 12/30/2011 05:44 PM, Gerd Stolpmann wrote:

> 1) Avoid hash tables in contexts where security is relevant. The
> alternative is Set (actually a balanced binary tree), which does not
> show this problem.

Highly recommended.  Nothing beats guaranteed O(log n) operations.

> 2) Use cryptographically secure hash functions.

Hopeless: with a hash size of 30 bits, as in Caml, or even 64 bits,
there are no cryptographically secure hash functions.

> 3) Use "randomized" hash tables. The trick here is that there is not a
> single hash function h anymore, but a family h(1)...h(n). When the hash
> table is created, one of the functions is picked randomly. This makes it
> impossible to craft an attack request, because you cannot predict the
> function. 

Indeed.  The optional "seed" parameter to Hashtbl.create does exactly
this in the new implementation of Hashtbl (the one based on Murmur3).

> So, the question is how to do 3). I see two problems here:
> 
> a) how to define the family of hash functions. Is it e.g. sufficient to
> introduce an initialization vector for the Murmurhash algorithm, and
> fill it randomly?

IIRC, the Web pages for the Murmur family of hashes gives some
statistical evidence that this approach works.

> How to get a random number that is good enough?

Hmm.  /dev/random is your friend on the platforms that support it.
Otherwise, there's always the Random module, but Random.self_init
isn't very strong.

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] OCaml maintenance status / community fork (again)

2011-12-12 Thread Xavier Leroy
On 12/10/2011 04:58 PM, Benedikt Meurer wrote:

> Maybe we can get back to my original proposal and restart on the
> right foot this time?

All right.  I have a number of concerns about your proposal, most of
which have already been mentioned by others.  Whether you call it a
fork or not, the mere fact of having two separate code bases for
exactly the same software components raises more issues than it solves:

- It complicates the lives of OCaml users, packagers, and 3rd-party
  library developers: what version should they use?  what will be the
  basis for the packagers's distribution-specific patches?  what
  happens if a library is compatible with one version but not the
  other?  what if the community effort runs out of steam in a few years?

- It means additional efforts with some duplication.  The
  synchronization between the "community" and the "reference" code
  bases will take work.  From the core team's standpoint, that's about
  as much work as dealing with individual suggestions and
  contributions.  At the same time, the community team will spend
  non-negligible time organizing and administering itself, at least
  initially.

- It would be a PR disaster.  I struggled (and still have to struggle)
  with my INRIA management to get a little bit of support for OCaml's
  development, e.g. Xavier Clerc's part-time assignment to work on
  OCaml.  This support is fragile and can easily disappear if there's
  a perception that "the community" will take care of that anyway.
  Likewise, I don't quite see how to explain the situation to the
  members of the Caml consortium.

For these reasons, I feel that the cure is worse than the illness.
I am, however, much more sympathetic to your other proposal, namely:

> Why not accept a model similar to i.e. the NetBSD project, with a
> lot of committers (experts in their own areas) and 2-3 people to
> keep an eye on the overall direction?

Except for the "lot of committers" part, this is pretty much how the
core Caml team has been working, and there is definitely room for more
contributors.  As I mentioned earlier, there are a number of areas
where we're lacking manpower, e.g. Windows-related stuff and tools
such as the debugger or Camlp4, but many other areas of the system
would benefit from more contributors too.  Smaller contributions
are most welcome as well, such as commenting on the PRs, testing
changes, new features and release candidates, PR triaging, helping to
identify the top little things to be resolved by the next release,
etc.

That's a much more low-key approach, but one that is more likely to
succeed.  Volunteers are welcome to contact us as c...@inria.fr.

- Xavier Leroy






-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Some comments on recent discussions

2011-12-10 Thread Xavier Leroy
On 12/07/2011 12:18 PM, Gabriel Scherer wrote:

>> The French book "Le langage Caml" is very great, althought it is quite old,
>> and althought examples used in the book (write a pascal compiler, a grep
>> tool and so on) is maybe too theoristic for engineer target.
>> Maybe a translation would be sufficient ?
> 
> I have contacted Xavier Leroy and Pierre Weis a few years ago to get
> the TeX sources of the book. I didn't intend to translate from French
> to English, but only, for a start, from Caml Light to Objective Caml.
> Neither of them could find the sources

There must have been a misunderstanding: I have the full LaTeX
sources, of course.  What may have been lost is Pierre's first cut at
an OCaml version of the book.

Alan Schmitt adds:

> Yes, it's a great start to write a course on Caml. (I realize that
> the online version is the second edition, yet it is shorter than the
> paper version I have here. Either some chapters were removed in the
> 2nd edition or they are not available online.)

No, no, the online edition is identical to the 2nd edition, which has
one additional chapter.  It's just that pages are wider and taller
than those of the 1st edition, resulting in fewer pages overall.

David Mentré adds:

> Unfortunately, it is available under a non Free license (Non
> Commercial clause of CC-BY-NC-SA licence), this is a show stopper for
> me. I would like such a book to be included in Debian or other Free
> Software distributions.

I stand very firmly by the NC license.  If anyone wants to make money
out of this book or a derived work, the original authors and all the
future contributors should get their share, Debian orthodoxy
notwithstanding.

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] OCaml maintenance status / community fork (again)

2011-12-10 Thread Xavier Leroy
se it is often hard to guess who cares about this or that
suggestion, or even what problem it is supposed to address.  Volunteers
could greatly help by simply commenting on the PRs in Mantis, to express
support or disagreement, or to ask for clarifications.

5- Before embarking on patching the core Caml distribution, it wouldn't hurt
to ask (privately) where the priorities are.  For instance, I don't see the
point for a linear-scan allocator (Benedikt Meurer) or more efficient
compilation of value let-rec (Fabrice le Fessant), but anyone who would come
up with a GHC-quality function inliner would be welcome like the Messiah.
Likewise, for many years I've been looking for developers to work on the
Windows port(s) of OCaml and never found any.  Finally, at the latest OCaml
consortium meeting, the idea of splitting Camlp4 off the core distribution
was floated around; volunteers to take over its maintenance would be most
welcome.

All right.  Let me stop here and pray for constructive, non-knee-jerk reactions.

- Xavier Leroy




-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] How to write an efficient interpreter

2011-10-24 Thread Xavier Leroy
On 10/24/2011 11:10 AM, Diego Olivier Fernandez Pons wrote:

> I have to write an interpreter for a datatype rich purely applicative
> language. I have already written a naive interpreter (like in programming
> languages class) and was wondering what where the options for writing
> something that would perform better while keeping it maintainable by a single
> person < 5% dedicated and preferably only in core-ML (no C code or fancy ML
> extensions).

Gabriel and Gerd gave very good advice already, but here are my two
cents:

- Beware of premature optimizations.  It could very well be that your
  naive interpreter is already fast enough for your applications.
  Test and profile to find hot spots.

- Pay special attention to your data structures, e.g. environments should
  be maps, not A-lists.  A good trick is to hash-cons identifiers
  into unique integers: not only comparisons are much faster,
  but you can use Patricia trees for your environments.

- Compiling to bytecode is probably overkill.

Hope this helps.

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] How to simplify an arithmetic expression ?

2011-10-02 Thread Xavier Leroy
On Sun, Oct 2, 2011 at 10:08 AM, Gabriel Scherer wrote:
> One approach I like for such simplifications is the "normalization by
> evaluation" approach.

NBE is neat, but I'm skeptical that it will work out of the box here:
if you apply NBE to a standard evaluator for arithmetic expressions,
it's not going to take advantage of associativity and distributivity
the way Diego wants.

On 10/02/2011 05:09 PM, Ernesto Posse wrote:
> In general, whenever you have an algebraic
> structure with normal forms, normal forms can be obtained by
> equational reasoning: using the algebra's laws as rewriting rules. 

Yes, writing down a system of equations is the first thing to do.
But, to obtain a normalization procedure, you need to orient those
rules and complete them (in the sense of Knuth-Bendix completion) with
extra rules to derive a confluent, terminating rewriting system.

Here is a good, down-to-earth introduction to Knuth-Bendix completion:

A.J.J. Dick, "An Introduction to Knuth-Bendix Completion"
http://comjnl.oxfordjournals.org/content/34/1/2.full.pdf

And here is a solid textbook on rewriting systems:

Franz Baader and Tobias Nipkow. "Term Rewriting and All That".
http://www4.in.tum.de/~nipkow/TRaAT/

Hope this helps,

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Strange crash after switching to first class modules

2011-09-29 Thread Xavier Leroy
On 09/29/2011 06:03 PM, John Carr wrote:

> One sentence summary: I added a local module generated by
> functor application to an unpacked first class module and
> the runtime blew up. [...]
> The last active code in my source is a function that looks like this
> in relevant part:
> 
>   let format chan x =
> let f = Printf.fprintf chan "%s %d" in
>   begin match x with
> | W (n,false) -> f "string" n
>   end
> 
> Note the partial application of fprintf.

Could you first try to eliminate that partial application of fprintf?
(Inline f at point of call, or eta-expand yourself.)  The Printf
module takes a lot of liberties with the type system (ahem), so it is
one of the usual suspects in such a case.

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



[Caml-list] [Ann] Zarith

2011-08-18 Thread Xavier Leroy
Dear list,

It is my pleasure to announce release 1.0 of Zarith, a new OCaml
library for exact, arbitrary-precision arithmetic on integers and
rationals.  Zarith was written by Antoine Miné with a little help from
me and feedback from Pascal Cuoq.

To download:  http://forge.ocamlcore.org/projects/zarith/

The implementation uses GMP (the GNU Multiple Precision arithmetic
library) to compute over big integers.  However, small integers are
represented as unboxed Caml integers, to save space and improve
performance. Big integers are allocated in the Caml heap, bypassing
GMP's memory management and achieving better GC behavior than e.g.
the MLGMP library.  Computations on small integers use a special,
faster path (coded in assembly for some platforms and functions)
eschewing calls to GMP, while computations on large intergers use the
low-level MPN functions from GMP.  As a consequence, Zarith is much
faster and more space-efficient than the Big_int module from OCaml's
standard distribution.

Additional niceties of Zarith include:
- short function names and infix operators, allowing one to write e.g.
Z.(~$2 + ~$5 * x)
- polymorphic comparisons (=, <, >, etc) work correctly on Zarith's
  big integers, provided OCaml 3.12.1 or later is used.

Feedback is welcome, preferably through the bug tracker at
http://forge.ocamlcore.org/projects/zarith/

Enjoy,

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Val_int vs caml_copy_nativeint

2011-08-08 Thread Xavier Leroy
On 08/08/2011 10:03 AM, Guillaume Yziquel wrote:

> Then I do not see anything wrong if the code snippet you sent. However,
> when you change Val_int to caml_copy_nativeint, the layout of the tuple
> is different. [...] So if you keep the same OCaml code when reading
> the result value, it's no surprise that the pointer is shown in
> place of the integer you expected.

This is good advice indeed: make sure your Caml type declaration
matches the data representation that your Caml/C stub implements...

>/* Package up the result as a tuple. */
>v_response = caml_alloc_tuple (3) ;
>Store_field (v_response, 0, Val_int (width)) ;
>Store_field (v_response, 1, Val_int (height)) ;
>Store_field (v_response, 2, caml_copy_string (code)) ;
>CAMLreturn (v_response) ;

Additionally, do make sure that "v_response" is registered with the GC
(declared with one of the CAMLlocal macros).

If both conditions are met, your code should be all right.  If it
still misbehaves, feel free to post a repro case on the bug tracker
http://caml.inria.fr/mantis

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Re: Great Renaming

2011-07-29 Thread Xavier Leroy
On 07/29/2011 06:59 PM, Sylvain Le Gall wrote:

> Another side effect: caml.inria.fr points to active-dvi !

Unrelated, but thanks for reporting it.  It seems that INRIA's DNS is
broken and reports two different IP addresses for caml.inria.fr, the
correct one and the one of the server hosting advi.inria.fr...
I notified our network admins but we'll have to wait until Monday for
a fix.  Be patient...

Mehdi Doggy adds:

> Do you plan to keep caml.inria.fr as well? or to add ocaml.inria.fr?

We didn't discuss that.  For the moment, I'll wait until the
caml.inria.fr issue is resolved before asking our network admins
anything else :-)

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] The CAML Anthology

2011-07-28 Thread Xavier Leroy
On 07/22/2011 07:14 PM, Nicolas Ojeda Bar wrote:

> Does anyone know if it is possible to get a copy of the
> INRIA document
> 
> 'The CAML Anthology' (1987)?
> 
> I don't think it is published.

I asked Gérard Huet, who, the next day, produced his paper copy of
that document.  Unfortunately, the LaTeX sources appear to be lost.

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Hashtbl performance

2011-07-28 Thread Xavier Leroy
Fabrice Le Fessant :

> Thus, you should consider using your own hash function, probably only
> considering the ints in the list and its length. Then, use the functor
> in the Hashtbl module.

On 07/28/2011 02:25 PM, barn...@recherche.enac.fr wrote:

> You could also use the hash_param : int -> int -> 'a -> int
> function in the functor which allows to specify the number
> of nodes traversed to compute the key :

Both Fabrice's and Nicolas's suggestions are excellent.

Let me just add that this problem with lists as hashtable keys is one
of the known issues with OCaml's current generic hash function: the
other is that the mixing functions used are simplistic and exhibit
some statistical bias, even for strings.

The SVN trunk for OCaml contains a complete reimplementation of the
generic hash function that addresses both issues: lists and other
complex keys are traversed breadth-first in a more cautious manner
than before, and the mixing functions are based on MurmurHash 3, which
exhibits very good statistical properties.  All this should go in the
next major release 3.13.  So, everyone with an interest in efficient
hash tables is welcome to try the trunk sources and let me know of any
problems encountered.

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] Nested callbacks?

2011-07-05 Thread Xavier Leroy
On 07/05/2011 06:30 PM, Thomas Fischbacher wrote:
> 
> Dmitry Bely wrote:
> 
>> Is it allowed to call a Caml closure from C (caml_callbackN_exn), that
>> calls another Caml closure internally (also with caml_callbackN_exn)?
> 
> I strongly hope so! If this did not work, that would have disastrous
> consequences for the tight integration of Caml and Python which we are
> using. (Well, so far, we never encountered a problem with that. And the
> documentation does not warn against doing this - so, should it not work,
> that should be considered a bug.)

Indeed, it should work, and I see no reason why nested callbacks could
fail.  Callbacks do save some Caml-specific state and restore it
before returning, but they use the stack to do so, so they should be
reentrant.  Please file a bug report if you find out they are not.

- Xavier Leroy



-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs



Re: [Caml-list] OCaml 3.12.1 compatibility report

2011-06-27 Thread Xavier Leroy
2011/6/27 xclerc :

> The tests are failing because a try is made to compare two big arrays with 
> different layouts.
> It used to be accepted by the big array compare function, but now only big 
> arrays with the
> same kind and layout can be compared [1].

Technically, bigarrays that differ in kind or layout can still be
compared safely (= without crashing), but in 3.12.1 and up they will
never compare equal.

This is really a corner case, because in classic Caml and not using
Obj.repr nor Obj.magic, the two bigarrays being compared must have the
same static type and therefore the same kind and layout.  With
first-class modules today, or GADTs tomorrow, it is possible to
compare two values having different representation types.  That's why
polymorphic comparison in 3.12.1 was hardened so that it would behave
better in this case.

I'd be interested to understand why bin_prot ends up comparing
bigarrays of different layouts: is this an oversight in the test suite
or a strong requirement?

- Xavier Leroy

-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs