Alrighty, I will try that later today and report back. I'm thinking maybe
that older image has a kernel with some crazy driver that takes forever to
emulate per-CPU, and that's what's making it take so long to boot.

Thanks,
Paul

On Wed, Mar 13, 2013 at 2:52 PM, <[email protected]> wrote:

> I'm trying to reproduce your issue, but I'm not able to do so...
>
> I was able to boot this image with c=16:
> http://bertha.cs.binghamton.edu/downloads/ubuntu-natty.tar.bz2
>
> In fact, I didn't even need the rootdelay argument until I configured
> MARSS for >16 cores. Could you try that image, and maybe consider
> upgrading to it if it works for you?
>
> Tyler
>
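(For anyone following along: the rough flow I'd expect for that test is
below. This is a sketch from memory, not something I've just run -- the
scons core-count option, the qemu flags, and the image filename inside the
tarball are assumptions to double-check against the MARSS docs.)

    # build MARSS with 16 simulated cores (assumed scons option)
    scons -Q c=16

    # fetch and unpack the natty image, then boot it with the MARSS-built qemu
    wget http://bertha.cs.binghamton.edu/downloads/ubuntu-natty.tar.bz2
    tar xjf ubuntu-natty.tar.bz2
    ./qemu/qemu-system-x86_64 -m 1024 -smp 16 \
        -drive file=ubuntu-natty.qcow2,cache=writeback -nographic
    # (-smp here is assumed to match the build-time c= value)
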
> > I can see the rootdelay parameter did its thing, because the kernel says
> > "waiting 200sec before mounting root partition" or whatever. After a few
> > minutes I get here:
> >
> >  * Filesystem type 'fusectl' is not supported. Skipping mount.
> >  * Starting kernel event manager...                               [ OK ]
> >  * Loading hardware drivers...
> > input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2
> > ACPI: Power Button [PWRF]
> > processor LNXCPU:00: registered as cooling_device0
> > processor LNXCPU:01: registered as cooling_device1
> > processor LNXCPU:02: registered as cooling_device2
> > processor LNXCPU:03: registered as cooling_device3
> > processor LNXCPU:04: registered as cooling_device4
> > processor LNXCPU:05: registered as cooling_device5
> > processor LNXCPU:06: registered as cooling_device6
> > processor LNXCPU:07: registered as cooling_device7
> > processor LNXCPU:08: registered as cooling_device8
> > processor LNXCPU:09: registered as cooling_device9
> > processor LNXCPU:0a: registered as cooling_device10
> > processor LNXCPU:0b: registered as cooling_device11
> > processor LNXCPU:0c: registered as cooling_device12
> > processor LNXCPU:0d: registered as cooling_device13
> > processor LNXCPU:0e: registered as cooling_device14
> > processor LNXCPU:0f: registered as cooling_device15
> >                                                                   [ OK ]
> >
> > and that's about where it gets really stuck for me. It just sits and
> > waits for a really long time.
> >
> > On Wed, Mar 13, 2013 at 12:10 PM, Paul Rosenfeld
> > <[email protected]> wrote:
> >
> >> Just for reference, here's the relevant entry from my menu.lst:
> >>
> >> title           Ubuntu 9.04, kernel 2.6.31.4qemu
> >> uuid            ab838715-9cb7-4299-96f7-459437993bde
> >> kernel          /boot/vmlinuz-2.6.31.4qemu root=/dev/hda1 ro single 1 rootdelay=200
> >>
> >> Everything look OK here?
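A quick way to confirm the argument actually made it to the kernel,
independent of which menu.lst entry grub ended up using, is to check the
live command line from inside the booted guest:

    cat /proc/cmdline
    # for the entry above this should print something like:
    # root=/dev/hda1 ro single 1 rootdelay=200
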
> >>
> >>
> >> On Wed, Mar 13, 2013 at 11:49 AM, Paul Rosenfeld
> >> <[email protected]> wrote:
> >>
> >>> Nope, nothing fancy. A few commits behind HEAD on master with some
> >>> modifications to the simulation (but nothing changed in qemu). Added
> >>> rootdelay=200 and it hangs right after freeing kernel memory and takes a
> >>> really long time to get the disks mounted and the devices loaded.
> >>>
> >>> I'll double check my image to make sure that the rootdelay made it into
> >>> the correct menu.lst entry.
> >>>
> >>> The odd thing is that once I get to the login prompt (if I'm doing an
> >>> interactive session), qemu is perfectly responsive. Maybe I'll try to
> >>> boot a raw image instead of qcow2 to see if that changes anything.
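If switching formats is worth a try, qemu-img can convert back and forth
without reinstalling anything (filenames below are placeholders). One caveat
worth remembering: qemu's internal savevm/loadvm snapshots only work with
formats that support internal snapshots (in practice, qcow2), so a raw image
would rule out the snapshot-based workflow discussed further down.

    # raw -> qcow2 (writes a new file, leaves the original alone)
    qemu-img convert -f raw -O qcow2 disk.raw disk.qcow2

    # qcow2 -> raw, for the comparison boot
    qemu-img convert -f qcow2 -O raw disk.qcow2 disk.raw
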
> >>>
> >>>
> >>>
> >>> On Wed, Mar 13, 2013 at 11:43 AM, <[email protected]> wrote:
> >>>
> >>>> I forgot to add --
> >>>>
> >>>> The only issue that I can see with your approach is keeping qemu in
> >>>> sync with ptlsim. If you look at `ptl_add_phys_memory_mapping` in
> >>>> qemu/cputlb.c, you'll notice that qemu feeds page mappings to ptlsim
> >>>> even when ptlsim isn't active.
> >>>>
> >>>> I could be wrong here, but I believe you'll need to update that
> >>>> mapping once you boot a checkpoint.
> >>>>
> >>>> We'd be more than willing to help you in whatever way we can to get
> >>>> something like this committed to master.
> >>>>
> >>>> Tyler
> >>>>
> >>>> > Paul,
> >>>> >
> >>>> > Adding rootdelay to menu.lst is the same thing as passing it as a
> >>>> > kernel argument, so yes... no difference.
> >>>> >
> >>>> > As Avadh mentioned, 5 hours is a _long_ time to get things going. I
> >>>> > got a 16+ core instance to get to a prompt in a few minutes last
> >>>> > time I tried. Admittedly, I never tried to create a checkpoint when
> >>>> > I had that many cores... is the checkpointing taking a long time, or
> >>>> > are you waiting that long just to boot the system?
> >>>> >
> >>>> > What value are you passing to rootdelay? You're building master
> >>>> > without debugging or anything fancy, right?
> >>>> >
> >>>> > Tyler
> >>>> >
> >>>> >> I added rootdelay to menu.lst (and I think in grub1 you don't have
> >>>> >> to do anything else, right?)
> >>>> >>
> >>>> >> I'm using the old parsec images with a reasonably ancient ubuntu.
> >>>> >> Should I be using something more recent?
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On Wed, Mar 13, 2013 at 11:13 AM, avadh patel <[email protected]>
> >>>> >> wrote:
> >>>> >>
> >>>> >>>
> >>>> >>>
> >>>> >>>
> >>>> >>> On Tue, Mar 12, 2013 at 1:19 PM, Paul Rosenfeld
> >>>> >>> <[email protected]> wrote:
> >>>> >>>
> >>>> >>>> Hello Everyone,
> >>>> >>>>
> >>>> >>>> It's been a while, but I'm starting to use MARSSx86 for
> >>>> >>>> simulations again. I've been trying to run 16-core simulations
> >>>> >>>> and am finding that the boot time is very long (~5 hours to make
> >>>> >>>> a checkpoint). This makes it quite frustrating when I
> >>>> >>>> accidentally set the wrong parameters inside the workload, run
> >>>> >>>> the wrong workload, or make any number of the other mistakes I
> >>>> >>>> tend to make.
> >>>> >>>>
> >>>> >>> Booting 16 cores should not take that long. Did you try adding the
> >>>> >>> 'rootdelay' option to the kernel command line? It significantly
> >>>> >>> improves kernel boot time in QEMU for a large number of cores.
> >>>> >>>
> >>>> >>> - Avadh
> >>>> >>>
> >>>> >>>
> >>>> >>>> So I was thinking -- what if I made a post-boot but
> >>>> >>>> pre-simulation-switch checkpoint (i.e., checkpoint but stay in
> >>>> >>>> emulation mode)? That way, the create_checkpoints.py script could
> >>>> >>>> just launch the system from the post-boot snapshot and proceed to
> >>>> >>>> launch the workloads, which would have the PTL calls that would
> >>>> >>>> then make the actual simulation checkpoints. Not only would that
> >>>> >>>> reduce the time it takes to create a lot of checkpoints, but
> >>>> >>>> also, if I screwed up a checkpoint, I could just delete it, boot
> >>>> >>>> the post-boot snapshot, tweak the workload, and re-checkpoint the
> >>>> >>>> simulation.
> >>>> >>>>
> >>>> >>>> I think marss checkpoints piggyback on qemu's snapshot
> >>>> >>>> capabilities, but is there some downside to this approach that
> >>>> >>>> I'm missing?
> >>>> >>>>
> >>>> >>>> Thanks,
> >>>> >>>> Paul
> >>>> >>>>
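The "post-boot snapshot" part of this maps pretty directly onto plain qemu
internal snapshots (qcow2 images only). A rough sketch of the manual version
of the workflow -- the snapshot tag name is arbitrary, and whether ptlsim's
view of guest memory stays consistent after a restore is exactly the sync
question Tyler raised elsewhere in this thread around
ptl_add_phys_memory_mapping:

    # 1. Boot once (the slow part), then save a snapshot from the qemu monitor:
    (qemu) savevm postboot

    # 2. Later runs skip the boot by restoring that snapshot at startup...
    ./qemu/qemu-system-x86_64 <usual options> -loadvm postboot

    # ...or restore from the monitor of an already-running instance:
    (qemu) loadvm postboot

From there, a script like create_checkpoints.py would only need to launch the
workloads, whose PTL calls create the real simulation checkpoints, as
described above.
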
> >>>> >>>
> >>>> >
> >>>> >
> >>>> >
> >>>>
> >>>>
> >>>
> >>
> >
>
>
_______________________________________________
http://www.marss86.org
Marss86-Devel mailing list
[email protected]
https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
