I believe that the alignment needs to be 16 bytes, not just 8 bytes
for the MOVDQA instruction.
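To make the 8-versus-16 distinction concrete, here is a small sketch of the alignment arithmetic in Python, working on plain integer addresses rather than real pointers. The helper names `is_aligned` and `align_up` are made up for illustration and are not MPIR functions; the only fact relied on is that MOVDQA faults when its memory operand is not on a 16-byte boundary.

```python
ALIGN = 16  # MOVDQA requires its memory operand on a 16-byte boundary


def is_aligned(addr, align=ALIGN):
    """Return True if addr is a multiple of align (align must be a power of two)."""
    return (addr & (align - 1)) == 0


def align_up(addr, align=ALIGN):
    """Round addr up to the next multiple of align (align must be a power of two)."""
    return (addr + align - 1) & ~(align - 1)


# A hypothetical address that an allocator guaranteeing only 8 bytes could return.
addr = 0x1008
print(is_aligned(addr, 8))   # True: fine for 8-byte loads
print(is_aligned(addr))      # False: MOVDQA would fault here
print(hex(align_up(addr)))   # 0x1010: the next 16-byte boundary
```

An allocator that only guarantees 8 bytes could be wrapped by requesting 15 extra bytes and rounding the returned address up, as long as the original pointer is kept for freeing.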
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Fri, Dec 11, 2009 at 9:31 AM, Cactus wrote:
>
>
> On Dec 11, 2:11 pm, Bill Hart wrote:
>> omalloc is easy to fix
Dan, thanks for this bug report. I could also see that it probably
took you many many tedious hours to find this bug. Amazing work!!
Thank you sooo much for taking the time to do it!!!
This is a really subtle bug. I just checked the malloc documentation
on several Linux distros, Mac OS X, BSD,
Okay, so ignore the last paragraph of my previous email :-)
Let me know when it's safe to take down the MPIR svn repo on
modular.math.jmu.edu
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Mon, Oct 19, 2009 at 8:59 AM, Bill Hart wrote:
>
> I've made a gi
> On Aug 12, 5:21 pm, jason wrote:
>> There is also the ansi to K&R conversion, no-one uses a K&R C
>> compiler nowadays?, I never have and I started using C in 94 (16bit
>> DOS...yuck), I propose we remove it.
>>
>> As only William has responded so far to this thread , I assume
>> everyone
No, there isn't. That's not due to any nefarious reasons, but simply
because I haven't figured out how to set it up!
--jason
p.s. If anyone else wants to, then I'm happy to hand over repo management :-)
On Fri, Jun 12, 2009 at 11:44 AM, David Harvey wrote:
>
> Hi,
>
> Is there any web interf
ation with /tmp/cleo/mpn/tmp-sub_n.s(118),p9
>> .libs/sub_n.o - 3 error(s), 0 warning(s)
>> make[2]: *** [sub_n.lo] Error 1
>>
>> icc -c99 -DHAVE_CONFIG_H -I. -I/home/jasonmoxham/mpir/mpir/trunk/mpn -I..
>> -D__GMP_WITHIN_GMP -I/home/jasonmoxham/mpir/mpir/trunk -
=64
> checking compiler icc -no-gcc ... no, program does not run
> configure: error: could not find a working compiler, see config.log for
> details
>
> I use the standard skynet_bash_profile so I assume the paths are correct and
> export CC=icc is how you use the intel compiler.
>
>
On Sun, May 17, 2009 at 3:34 PM, Jeff Gilchrist
wrote:
>
> On Sun, May 17, 2009 at 3:05 PM, Bill Hart
> wrote:
>
>> We intend to discuss:
>>
>> * CUDA
>> * Assembly language for NVIDIA cards
>
> Everything seems to be CUDA and NVIDIA these days. Is there no love
> for the also powerful ATI car
do for that. We can certainly revisit the decision
>> later, however for now there *is* a git repo.
>>
>> For those who wish to keep using svn, absolutely nothing has changed
>> for you guys. Keep committing and updating as usual. The git system is
>> totally ext
Hi Dave,
Welcome and Thanks!
If you're interested in using a distributed version control system,
could I convince you to use Mercurial instead of git? The
functionality is similar between the two of them, but since Mercurial
is written in Python, it is quite a bit more portable. I think that
T
On Sun, Mar 15, 2009 at 1:36 PM, Jason Moxham wrote:
>
> On Sunday 15 March 2009 17:29:30 Jason Martin wrote:
>> > On Sunday 15 March 2009 17:03:51 Jason Martin wrote:
>> >> Hi Guys,
>> >>
>> >> Sorry for the late reply, but I've been
> On Sunday 15 March 2009 17:03:51 Jason Martin wrote:
>> Hi Guys,
>>
>> Sorry for the late reply, but I've been camping for the last couple days...
>>
>> I believe that I can rewrite the core2 code to avoid the lahf/sahf
>> instructions without
Hi Guys,
Sorry for the late reply, but I've been camping for the last couple days...
I believe that I can rewrite the core2 code to avoid the lahf/sahf
instructions without any performance loss. If there is still an
interest or need, let me know and I'll have a go at it.
--jwm
On Sun, Mar
On Sat, Mar 7, 2009 at 8:34 PM, Bill Hart wrote:
>
> The problem on varro seems to be a screwed up gcc. This is what you
> get when you type gcc -v:
>
> varro:~/mpir-varro wbhart$ gcc -v
> Using built-in specs.
> Target: powerpc-apple-darwin8.11.0
> Configured with: /usr/local/gcc-4.3.3/src/gcc-4
On Thu, Mar 5, 2009 at 7:38 PM, wrote:
>
> On Thursday 05 March 2009 22:54:34 Jason Martin wrote:
>> > I've got a 4.4c/l, Agner Fog says the throughput on 64 bit mul is 4c/l,
>> > but in another section it says you can issue one every cycle, if the
>>
were ordering one of these?
>
> Bill.
>
> 2009/3/5 Jason Martin :
>>
>>> I've got a 4.4c/l, Agner Fog says the throughput on 64 bit mul is 4c/l, but
>>> in another section it says you can issue one every cycle, if the latter
>>> section mea
> I've got a 4.4c/l, Agner Fog says the throughput on 64 bit mul is 4c/l, but
> in another section it says you can issue one every cycle, if the latter
> section means 32bit then 4c/l could be right (I say could, not would, as
> consider the K8), I expect I can do a bit better than 4.4c/l, bu
On Wed, Mar 4, 2009 at 7:19 AM, Jason Martin
wrote:
> Well, the long answer seems to be: Apple doesn't support linking
> against static versions of Apple supplied libraries. In other words,
> there is no static version of crt0.o
>
> I'm coming to this conclusion f
will be dynamically linked in.
For MPIR, this should mean that we can build the libraries statically,
but all the test and tuning code will need to be built without the
"-static" flag.
--jason
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Tue, Mar
I was recently running some timing tests for GPU based computing on a
Macbook Pro with a Core 2 processor. I was amazed that the tiny
little GPU on my Macbook seemed to be outperforming the GPU in my
desktop. But, then I noticed that when I ran really long jobs, the
laptop was taking much longer
using -static. This is on martinj.
>>
>> Bill.
>>
>> 2009/3/3 Jason Martin :
>>>
>>> Are you building in shared (dynamically loadable) object mode or static?
>>>
>>>
>>
Are you building in shared (dynamically loadable) object mode or static?
On Tue, Mar 3, 2009 at 12:49 PM, Bill Hart wrote:
>
> When I try to link against mpir on OSX (leopard?? - how would I even
> find that out) it complains:
>
> ld: library not found for -lcrt0.o
> collect2: ld returned 1 ex
I strongly favor Bill's suggestion!
On Tue, Mar 3, 2009 at 9:55 AM, Bill Hart wrote:
>
> I have a concern about mpn_addadd, mpn_addsub, mpn_mulredc, etc. I
> would very much like to change the names of these to mpir_n_addadd,
> mpir_n_addsub, mpir_n_mulredc, etc.
>
> The reason is that when GMP
I like Jason's idea.
I'm not sure if it would generate a huge improvement on K8-10 type
processors where the mul is already quite fast, but on the Intel
processors with their slow multiply instructions it seems like doing
"add submul" would be faster than "addadd mul" just because the
add/sub can
Sounds fine.
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Sun, Feb 22, 2009 at 2:49 PM, wrote:
>
>
> If there are no objections, I will merge the core-2 branch into trunk
> tomorrow. All I have changed are inc/dec to add/sub.
> I didn't do it for mpn
On Fri, Feb 20, 2009 at 3:18 PM, wrote:
>
> On Friday 20 February 2009 16:42:19 Jason Martin wrote:
>> > On Friday 20 February 2009 14:12:19 ja...@njkfrudils.plus.com wrote:
>> >
>> > What happened to core-2 mul_basecase and sqr_basecase ? , no-wonder
>> >
> On Friday 20 February 2009 14:12:19 ja...@njkfrudils.plus.com wrote:
>
> What happened to core-2 mul_basecase and sqr_basecase ? , no-wonder core-2
> benchmarks are crap
There aren't any :-) I was just using Gaudry's code for those routines.
Should be able to use your amd64 code for those eve
On Thu, Feb 19, 2009 at 6:41 AM, Cactus wrote:
> I had an interesting email exchange with Agner Fog about this and he
> argues that it is not unreasonable to ignore the need for exception
> support in low level code since very few exceptions can occur anyway
> and those that can are not likely to
On Wed, Feb 18, 2009 at 8:05 PM, wrote:
>
> On Thursday 19 February 2009 00:49:16 Jason Martin wrote:
>> On Wed, Feb 18, 2009 at 7:13 PM, wrote:
>> > On Wednesday 18 February 2009 22:03:43 Mariah wrote:
>> >> gmp-4.2.4 mpir-0.9.0
>> >>
>>
On Wed, Feb 18, 2009 at 7:51 PM, William Stein wrote:
>
> On Wed, Feb 18, 2009 at 4:49 PM, Jason Martin
> wrote:
>>
>> On Wed, Feb 18, 2009 at 7:13 PM, wrote:
>>>
>>> On Wednesday 18 February 2009 22:03:43 Mariah wrote:
>>>> gmp-4.2.4 m
On Wed, Feb 18, 2009 at 7:13 PM, wrote:
>
> On Wednesday 18 February 2009 22:03:43 Mariah wrote:
>> gmp-4.2.4 mpir-0.9.0
>>
>> 2241.9 2251 cicero (pentium4-pc-linux-gnu)
>> 3371.5 3369.3 cleo (ia64-unknown-linux-gnu)
>> 6024.5 7437.8 eno (core2-unknown-linux-gn
No, we can't guarantee minimality. The xgcd that you're seeing right
now in MPIR only uses Moller's code when it calls to gcd. It isn't as
nice as the xgcd that is in GMP 4.3. Hopefully, we will be able to
re-invent that wheel since I couldn't find an LGPL v. 2+ version of
Moller's xgcd code.
C
Looks great Bill. Nice work. My only comment is that you might want
to explicitly put the version number for GMP and the LGPL from which
MPIR is derived... just to avoid any confusion.
--jason
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Sun, Jan 11,
This wasn't clear to me.
From the structure of the Moller LGPL v2.1 patches, it looks to me
like these were "proof of concept" code. I suspect that he cleaned
the code up a lot for the GMP 4.3 contribution, but I haven't looked
at it because I didn't want to "taint" the v2.1 version with a
poss
02.html
>
> Bill.
>
> 2009/1/10 Jason Martin :
>>
>> eno is also using svn version 1.4.6, so that probably isn't the issue.
>>
>> I'm not very svn savvy, so if there are some "post commit hooks" or
>> things like that which should
at behavior, if so let me know
what it is.
--jason
On Fri, Jan 9, 2009 at 10:30 PM, Bill Hart wrote:
>
> I mean I checked out on eno then switched to fulvia.
>
> 2009/1/10 Jason Martin :
>>
>> The SVN running on the server is version 1.4.6
>>
>> The most
Actually, while we're on the topic of repository servers, it might be
a good idea if someone else set up an additional WebDAV server so that
we could use svnsync to maintain mirrored repositories.
--jason
On Fri, Jan 9, 2009 at 9:47 PM, Jason Martin
wrote:
> The SVN running on the s
The SVN running on the server is version 1.4.6
The most current SVN version is 1.5.5, so "no" we aren't using the
most current version. We are using the version in the standard Ubuntu
distro.
I just looked at the subversion web page, and the native file
formatting stuff isn't mentioned in the d
>
> Did you scroll down the page? There are/were quite a few blockers!!
>
> If you open any particular ticket, then in the upper left it tells you
> somewhere that it is a blocker, critical, major, minor, etc.
>
> Tickets are also all ordered by severity. The top ones are all block
Hey Bill,
When I look at the trac server, everything is the same color. I don't
know if that's a browser issue or what, but could things that are
"blockers" just be labeled as such? Or, if that's too much effort,
could you just indicate which labels mean something is a blocker?
Thanks,
jason
I'd like to get rid of NAILS support. It adds a huge amount of
complication, and I haven't seen it actually pan out for any modern
processor.
I vote for leaving the existing NAILS code alone, but explicitly
stating that NAILS is not supported. Perhaps, if we do discover some
use for it, it migh
Excellent. I'm about to leave for the MAA-AMS Joint Meetings, so I
won't get a chance to bench your K10 stuff on a Dunnington until I get
back. I'll report on what speeds I get.
--jason
On Sun, Jan 4, 2009 at 5:44 PM, wrote:
>
> On Sunday 04 January 2009 01:57:05 Jason
Excellent!! Thanks for the error checking Michael!!
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Sat, Jan 3, 2009 at 8:56 PM, mabshoff
wrote:
>
>
>
> On Jan 3, 5:20 pm, mabshoff wrote:
>> Hi,
>
>
>
>> I am now running some larger example
n Sunday 04 January 2009 00:57:44 ja...@njkfrudils.plus.com wrote:
>> On Sunday 04 January 2009 00:36:46 Jason Martin wrote:
>> > Alternatively, we could stop trying to identify chips by marketing
>> > brands and just use the values returned by CPUID. This would create a
Alternatively, we could stop trying to identify chips by marketing
brands and just use the values returned by CPUID. This would create a
lot of duplicated code in sub-directories, but disk space is cheap.
So, would something like:
mpn/x86_64//
work for our configuration?
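As a rough illustration of what keying on CPUID values could look like, the sketch below decodes family, model and stepping from the EAX value returned by CPUID leaf 1, using the standard rule from the Intel and AMD manuals. It is written in Python on a plain integer; the function name and the example signature values are illustrative assumptions, and this is not how the MPIR configure scripts currently detect CPUs.

```python
def decode_cpuid_signature(eax):
    """Split the CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family field is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family

    # The extended model only counts for base family 6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


# Example signatures, used here purely as test inputs.
print(decode_cpuid_signature(0x000006F6))  # (6, 15, 6)
print(decode_cpuid_signature(0x00010676))  # (6, 23, 6)
print(decode_cpuid_signature(0x00100F22))  # (16, 2, 2)
```

A directory layout keyed on these numbers would not depend on marketing names at all, at the cost of the duplicated code mentioned above.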
Jason Worth Martin
Ass
Ah. That makes sense, and it seems like a nice way to exploit the
instruction level parallelism. Sounds like a good way to go.
--jason
On Thu, Jan 1, 2009 at 4:21 PM, wrote:
>
> On Thursday 01 January 2009 21:11:38 Jason Martin wrote:
>> Pardon my daftness, but what exactly do
Pardon my daftness, but what exactly do you mean by a bidirectional divexact?
--jason
On Thu, Jan 1, 2009 at 4:05 PM, Bill Hart wrote:
>
> Number two would be very interesting. I've thought about implementing
> such a thing elsewhere for polynomials, but it also makes sense for
> MPIR.
>
> Bill
Hi,
At the moment there isn't a tarball for MPIR as it is not yet stable
(although it is very close to being so).
You should be able to do a read-only checkout of the SVN code using the URL
http://modular.math.jmu.edu/svn/mpir
with an svn command (in Linux) similar to this:
svn co http://modu
I've fixed the problems with "make tune" and "make speed" with the
Moller patches.
I haven't fixed the nails problem yet, or some of the build warnings,
but I'll continue trying to wade through the build scripts to see
where I need to tweak things.
In the meantime, please check out the latest v
-tdiv_ui
> PASS: t-fdiv
> PASS: t-fdiv_ui
> PASS: t-cdiv_ui
> nhgcd2.c:206: GNU MP assertion failed: h0 == h1
> /bin/sh: line 4: 31599 Aborted ${dir}$tst
> FAIL: t-gcd
> PASS: t-gcd_ui
> nhgcd2.c:206: GNU MP assertion failed: h0 == h1
> /bin/sh:
Attached are some edited versions of
mpn/generic/gcd.c
and
mpn/generic/ngcd.c
Drop them in, test them for correctness and speed. Let me know what
breaks. When everyone is happy, I'll check them in to svn.
--jason
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~mart
ooks right:
>
> http://sage.math.washington.edu/home/wbhart/flint-trunk/graphing/polymul3.png
>
> This was a stupid mistake on my part. However it did allow me to
> notice that sage.math is not being detected, so it had some use.
>
> Bill.
>
> 2008/12/22 Jason Martin :
>>
Here's some interesting timing results for the Moller code. Note that
after roughly 16,000 decimal digits the bgcd algorithm becomes the
fastest.
approx 0256 digits
gcd 57488.00 cycles
rgcd 57160.00 cycles
bgcd
Well, I just got a new machine with Xeon E7450 processors in it. The
7400 series are all using the Dunnington architecture, so I'll start
looking into it as soon as I get the machine stable.
Agner Fog doesn't have these new cores in his guides yet... probably
he'll have them soon.
Jason Worth M
bsite and look at the
> asymptotics. That has nothing to do with Magma.
>
> Bill.
>
> 2008/12/21 Jason Martin :
>>
>>> By the way, the code you posted has a variable missing in the final
>>> printf. It always returns 0.00.
>>
>> Sorry, that was a version cont
> By the way, the code you posted has a variable missing in the final
> printf. It always returns 0.00.
Sorry, that was a version control issue. The numbers I posted below
are for the correct code (with the "result" variable printed).
I don't get the error message that you are reporting when buildin
I've been running some speed tests with the Moller patches, and it
looks to me like they work just fine for larger numbers, but at
smaller limb counts they are slower than the original code. I've
attached my test code so that you can see what I'm doing.
I suspect that I need to add some tuning to
I'm working on that right now :-)
On Fri, Dec 19, 2008 at 10:56 PM, Bill Hart wrote:
>
> OK, I see there is now a K8-experimental, so I'll hold off on making
> another copy for now. Once Jason Martin has finished with the GCD
> patches I/some
> Hi,
>
>> > Yikes... high traffic, I can barely keep up!
>
> Well, high traffic is a good thing IMHO :)
>
>> > RE: Moller patches
>>
>> > On hold until Final Exams are done
>
> Ok, got a date for that since I don't know your schedule?
Dec. 20th. If I haven't posted something about it by then, p
Yikes... high traffic, I can barely keep up!
RE: Moller patches
On hold until Final Exams are done
RE: OSX 64-bit
Michael, does this mean that all the Python libraries are building on
OSX in 64-bit mode now? Last time I tried playing with 64-bit Sage
builds on OSX, I couldn't get all the Pyt
I've seen the same type of problem with Core 2, but at different limb
counts... and it was different on Conroe versus Woodcrest, which is
wacky because they are supposed to be the "same" architecture. I've
got a Dunnington machine coming in next week, so I'm curious to see
how its performance di
Hi All,
After much fighting with Apache, I think that I might possibly have a
somewhat correct WebDAV-based MPIR SVN repository set up (how about
them caveats?). The MPIR repository is now located
at:
http://modular.math.jmu.edu/svn/mpir
You can browse the source via a web-brow
e latter is that we
>> need 4 mul's in the loop which already ties up alu0 for 8 of the 10
>> cycles allowed in the loop. Therefore one needs to schedule everything
>> carefully so that nothing much else runs in ALU0. That requires
>> knowledge of the pick
However, given a 32 bit quantity in the lower 32 bits of a 64 bit
> word, shift right by n can be simulated by shift left by 32 - n.
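The trick described above can be checked with a few lines of Python, modelling a 64-bit word by masking. The left shift itself is built from repeated additions (doubling), since that is the only primitive assumed to be available. The function names are made up for this sketch, and reading the high half is written as `>> 32` only because Python has no registers; on the hardware it would mean taking the upper 32 bits of the word.

```python
MASK64 = (1 << 64) - 1


def shl_by_adds(x, n):
    """Left shift a 64-bit word by n using only additions (repeated doubling)."""
    for _ in range(n):
        x = (x + x) & MASK64
    return x


def shr32_via_shl(x, n):
    """Simulate x >> n for a 32-bit x held in the low half of a 64-bit word."""
    if not (0 <= x < (1 << 32) and 0 <= n <= 32):
        raise ValueError("x must fit in 32 bits and n must be in 0..32")
    word = shl_by_adds(x, 32 - n)
    return word >> 32  # read the upper 32 bits of the 64-bit word


x = 0xDEADBEEF
print(shr32_via_shl(x, 4) == x >> 4)  # True
```

This works because x * 2^(32 - n) never overflows 64 bits when x fits in 32 bits, so the upper half holds exactly floor(x / 2^n).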
>
> Bill.
>
> 2008/11/23 Jason Martin <[EMAIL PROTECTED]>:
>>
>> So, as I look over the CUDA specification I don't see support
> You assume OOO works perfectly.
>
>         mov $0,%r11
>         mul %rcx
>         add %rax,%r10
>         mov 24(%rsi,%rbx,8),%rax
>         adc %rdx,%r11
>         mov %r10,16(%rdi,%rbx,8)
>         mul %rcx
> here    mov $0,%r8
>         add %rax,%r11
>         mov 32(%rsi,%rbx,8),%rax
>         adc
So, as I look over the CUDA specification I don't see support for some
important integer operations like: shift, rot, mul, and div. I
suppose that left shift could be implemented by repeated adds, but I
can't see an easy way to implement right shift (if I'm missing
something, or if rot is a simpl
This looks very worthwhile! Perhaps, as a first "proof of concept,"
just request access to the cudacluster that nvidia is evidently
running. Then, if it looks like some high-performance multi-precision
arithmetic could actually be done using the cuda standard, submit a
larger request.
I'll go r
> Sorry about that Jason. We are currently in the process of moving the
> svn repo and it is pretty poor of us to not update the wiki.
>
> The new svn is
>
> svn+ssh://[EMAIL PROTECTED]/home/martin/mpir-svn-repo/mpir
>
> However you will probably need a username to check it out
Hi Jason,
Thanks for pointing out your work! We're always looking for faster
and faster bits of code!!
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Sat, Nov 15, 2008 at 12:29 PM, jason <[EMAIL PROTECTED]> wrote:
>
> I've done a bit of work on gmp , ava
of Mathematics
http://www.math.jmu.edu/~martin
On Thu, Nov 6, 2008 at 2:08 PM, Cactus <[EMAIL PROTECTED]> wrote:
>
>
>
> On Nov 6, 5:28 pm, "Jason Martin" <[EMAIL PROTECTED]>
> wrote:
>> Now you're starting to make me nervous that there is a repository
>
6, 2008 at 12:46 PM, William Stein <[EMAIL PROTECTED]> wrote:
>
> On Thu, Nov 6, 2008 at 9:41 AM, Cactus <[EMAIL PROTECTED]> wrote:
>>
>>
>>
>> On Nov 6, 5:28 pm, "Jason Martin" <[EMAIL PROTECTED]>
>> wrote:
>>> Now you'r
e:
>
> On Thu, Nov 6, 2008 at 9:28 AM, Jason Martin
> <[EMAIL PROTECTED]> wrote:
>>
>> Now you're starting to make me nervous that there is a repository
>> problem!! I spent many hours merging Moller's gmp_h.in into ours. I
>> also replaced th
Now you're starting to make me nervous that there is a repository
problem!! I spent many hours merging Moller's gmp_h.in into ours. I
also replaced the GMP gcd routine with Moller's so his routines should
be getting called.
Anyway, get SVN back up and I'll be happy to dig in and see what's
goin
Also, is SVN back up yet?
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Wed, Nov 5, 2008 at 11:56 AM, Bill Hart <[EMAIL PROTECTED]> wrote:
>
> Ok I'll give it a go. Perhaps it'll turn up something stupid that I did.
>
> Bill
08/11/1 Bill Hart <[EMAIL PROTECTED]>:
>>> Sorry, I simply mean you commit to the repo. I agree Moller's algorithm
>>> is clever and it should be pointed out that his paper on the topic is
>>> new work. He didn't just implement the half-GCD algorithm.
>>>
warning: implicit declaration of function 'mpn_bgcd'
> gcd.c: In function 'mpz_sgcd':
> gcd.c:175: warning: implicit declaration of function 'mpn_sgcd'
> gcd.c: In function 'mpz_ngcd':
> gcd.c:179: warning: implicit declaration of function '
Anyone want this one? If not, I'll go fix it.
Jason Worth Martin
Asst. Professor of Mathematics
http://www.math.jmu.edu/~martin
On Fri, Oct 31, 2008 at 9:14 PM, mabshoff
<[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> the following popped up on gmp-bugs. The fix is correct and was
> proposed by Mickae
On Fri, Oct 31, 2008 at 9:11 PM, mabshoff
<[EMAIL PROTECTED]> wrote:
> [snip]
> For me the highest priority item is Moller's gcd code which has
> already been merged. What is the status of the code, i.e. performance,
> any known bugs, etc?
>
> Cheers,
>
> Michael
Hi Michael,
I haven't tested
Hi All,
Well, I'm now comfortable with writing 64-bit assembly code for
Windows (a side effect of NIST specifying Windows Vista as the
benchmark platform for the hashing competition). So, *if* there is
sufficient interest, I'd be willing to look at merging the separate
Windows vs. *nix x86_64 co
Hi All,
I've committed some pretty big changes to the trunk. To get Moller's
gcd code incorporated into the build tree I needed to modify a lot of
the aclocal, autoconf, and automake scripts. So, make sure you do a
"make distclean" before your next svn update. Seems to pass make
check for me o