[kvm-devel] Analysis of contracts in conventional units

2008-02-15 Thread Селиван
What's new in financial and tax accounting in 2008.

One-day seminar / February 29, 2008 / Moscow

1. Analysis and comparison of PBU 3/2006 and PBU 3/2000, "Accounting for
assets and liabilities denominated in foreign currency".

2. Analysis of contracts in conventional units:
-- the price of goods (work, services) is fixed on the payment date
-- the price of goods (work, services) is fixed on the shipment date.
The prepayment problem. The Ministry of Finance's position. Numerical examples.

3. The seller's financial and tax accounting under contracts concluded in
conventional units from 2007, compared with 2006.
Numerical examples from the Ministry of Finance and its problematic position on
recording amount differences net of VAT in tax accounting.
Officials' attitude toward issuing invoices in conventional units.
Letters from the Ministry of Finance and the Federal Tax Service (FNS).
New requirements for completing Form 2, the Profit and Loss Statement.

4. What's new in measuring the initial cost of non-current assets (fixed
assets, intangible assets), raw materials (supplies), goods, work, and services
acquired by the buyer for rubles in an amount equivalent to a foreign-currency
sum, from 2007.
Practical situations. Numerical examples. Ministry of Finance letters.

5. Problems in recording business transactions under agency, rental, leasing,
and loan contracts concluded in conventional units and paid in rubles.
Examples. Ministry of Finance letters.

6. Difficult aspects of applying PBU 18/02 when recording contracts concluded
in conventional units. The Ministry of Finance's contradictory position.
Numerical examples.

7. Letters of the Russian Ministry of Finance and the Federal Tax Service of Russia.

8. Answers to seminar participants' questions.

Tuition: 4,900 rubles (VAT included).
(The price includes handouts, a coffee break, and lunch at a restaurant.)

By phone: (Ч9.5) 5Ч3_88..Ч6






-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [kvm-ppc-devel] KVM's signal masking

2008-02-15 Thread Avi Kivity
Hollis Blanchard wrote:
> We're having a hard time tracking down a PowerPC bug that seems to be
> related to KVM's signal handling (SIGALRM in particular), so we're
> trying to understand the overall signal handling design.
>
> It looks like the run sequence goes something like this:
>      1. qemu: block SIGALRM (and a couple others)
>      2. qemu: call kvm_run
>      3. kvm: unblocks SIGALRM
>      4. kvm: executes guest
>      5. kvm: exit handler checks signal_pending(); if true returns to
>         qemu
>      6. kvm: re-blocks SIGALRM and returns to qemu
>      7. qemu: kvm_eat_signals() synchronously calls the normal handlers
>         for blocked signals

Yes.

> I'm confused about a few things. First, why must qemu unblock these
> signals? AFAICS signal_pending() still returns true regardless of the
> process's signal mask.

You mean kvm unblocks.  If the signals are blocked, the kernel will not
wake up a sleeping process (or IPI a running one), resulting in
unbounded latency.

> Second, why are we synchronously calling the signal handlers in the
> first place? Why not allow the signals simply to be delivered?

Async signal delivery is slow and racy (it can happen between any two
instructions, without locking).

Ideally, we wouldn't dispatch the signals at all; dequeuing a signal
means "go check if something happened" via select() or aio completion.
I hadn't checked that all signal handlers are safe to omit, hence the call.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.
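
The synchronous dequeue discussed in this message can be illustrated in plain userspace C. This is only a sketch of the general technique (keep SIGALRM blocked, then pull pending instances off the queue and run the handler body as an ordinary function call); it is not the qemu or kvm code. The names `on_alarm`, `block_alarm`, and `eat_pending_alarms` are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical bookkeeping: how many times the handler body has run. */
static int alarms_handled;

/* Stand-in for the "normal handler"; here it is called as a plain function. */
static void on_alarm(int signo)
{
    (void)signo;
    alarms_handled++;
}

/* Block SIGALRM so it stays pending instead of interrupting the thread. */
static int block_alarm(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/*
 * Dequeue every pending SIGALRM without sleeping and run the handler
 * synchronously. Returns the number of signals consumed.
 */
static int eat_pending_alarms(void)
{
    sigset_t want, pending;
    int signo, eaten = 0;

    sigemptyset(&want);
    sigaddset(&want, SIGALRM);
    for (;;) {
        /* Only call sigwait() when it is known not to block. */
        if (sigpending(&pending) != 0 || !sigismember(&pending, SIGALRM))
            break;
        if (sigwait(&want, &signo) != 0)
            break;
        on_alarm(signo);
        eaten++;
    }
    return eaten;
}
```

Because the handler runs at a well-defined point in the main loop, it cannot interleave with arbitrary instructions, which is the race the message refers to.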




Re: [kvm-devel] [PATCH] enable gfxboot on VMX

2008-02-15 Thread Anthony Liguori
Alexander Graf wrote:
> Hi,
>
> this issue has already been talked about previously. Gfxboot on VMX is
> broken, because it reads SS after switching from real to protected mode,
> where SS contains an invalid value, which VMX does not allow.
> As far as I know, gfxboot is the only application that suffers from this
> issue.
> The current fix is to make gfxboot use a previously stored SS value,
> which works fine for new releases. Already shipped versions of the
> software can not be changed though, so there needs to be another way to
> make kvm work with older versions of gfxboot.
>
> As everything except gfxboot works, we can simply change gfxboot in
> runtime to use a different value. Unfortunately the mov instruction,
> used to read the SS register is only 2 bytes long, so there is no way to
> binary patch the mov to something that would contain an address. So the
> only way I could think of was an invalid instruction. The UD exception
> is intercepted in KVM and is already emulated for VMCALLs. This can be
> extended to an opcode, that is officially unused (0f 0c) and have the
> emulator do a mov realmode_ss, %eax.
>
> This patch implements exactly this idea and fixes openSUSE  11.0 and
> Ubuntu CD booting on VMX for me. Comments are, as always, welcome.
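
The idea quoted above (trap on an unused two-byte opcode and have the emulator supply the saved SS) can be sketched as a toy decoder. This is a hypothetical illustration, not the patch itself: `struct toy_vcpu` and `emulate_ud` are invented names, and the real KVM emulator has a very different structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal guest state for the illustration. */
struct toy_vcpu {
    uint32_t eax;
    uint32_t eip;
    uint16_t realmode_ss;   /* SS value saved before the mode switch */
};

/*
 * Called when the guest raised #UD at insn[0..len). Returns 1 if the bytes
 * are the patched-in opcode 0f 0c and it was emulated, 0 otherwise.
 */
static int emulate_ud(struct toy_vcpu *v, const uint8_t *insn, size_t len)
{
    if (len >= 2 && insn[0] == 0x0f && insn[1] == 0x0c) {
        v->eax = v->realmode_ss;  /* the "mov realmode_ss, %eax" effect */
        v->eip += 2;              /* same length as the 2-byte mov it replaced */
        return 1;
    }
    return 0;                     /* some other invalid instruction */
}
```

Keeping the replacement at two bytes matters because it overwrites a two-byte `mov` in place, so the surrounding instructions keep their addresses.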
   

Have you tried SLES-9 or openSUSE variants of the same age?  The ss 
issue in gfxboot is only something recently introduced.  Prior to that, 
gfxboot used big real mode so your patch wouldn't be sufficient for 
those versions of gfxboot.

One thing I've thought about is converting gfxboot-disable[1] to 
generate a qcow2 that backs to the actual CDROM ISO.  Then in QEMU we 
could take an MD5 of an ISO if trying to boot from it, compare it to a 
white list of known bad CDs, and then generate a qcow2 automatically 
with gfxboot disabled.  When we eventually support big real mode in the 
kernel, we can disable this.

[1] http://hg.codemonkey.ws/gfxboot-disable

Regards,

Anthony Liguori
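
The digest check proposed in this message could look roughly like the lookup below. It is a sketch under stated assumptions: the function name `iso_needs_gfxboot_disabled` is invented, the digest is assumed to arrive as a lowercase hex string computed elsewhere, and the table entries are placeholders, not digests of real CD images.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder entries only; a real list would hold digests of affected ISOs. */
static const char *const known_bad_md5[] = {
    "00000000000000000000000000000000",
    "ffffffffffffffffffffffffffffffff",
};

/* Returns 1 if the ISO's MD5 (lowercase hex) is on the known-bad list. */
static int iso_needs_gfxboot_disabled(const char *md5_hex)
{
    size_t i;

    for (i = 0; i < sizeof(known_bad_md5) / sizeof(known_bad_md5[0]); i++) {
        if (strcmp(md5_hex, known_bad_md5[i]) == 0)
            return 1;
    }
    return 0;
}
```

A match would then trigger generation of the qcow2 overlay described above; that step is not shown here.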


 Signed-off-by: Alexander Graf [EMAIL PROTECTED]


   
 



[kvm-devel] [FW: Announcing oVirt: web based VM management platform]

2008-02-15 Thread Daniel P. Berrange
The announcement below may well be of interest to people involved
in KVM. oVirt is using libvirt as its management API, and the current
builds use Fedora 9 + KVM to get a cutting edge virtualization 
platform / technology in combination with cutting edge Linux kernels :-)
If anyone's interested, there's mailing list / IRC details below...

Regards,
Dan.

- Forwarded message from Hugh O. Brock [EMAIL PROTECTED] -

 Date: Thu, 14 Feb 2008 17:04:56 -0500
 From: Hugh O. Brock [EMAIL PROTECTED]
 To: [EMAIL PROTECTED], [EMAIL PROTECTED]
 Subject: [et-mgmt-tools] Announcing oVirt
 
 Announcing oVirt
 
 
 It is my pleasure to announce oVirt, the next step in open virtual
 machine management.
 
 oVirt is:
 
 * A small OS image that runs libvirt and hosts virtual machines 
 * A Web-based virtual machine management console
 
 oVirt goals:
 
 * Empower virtual machine owners without giving up control of
   hardware
 * Automate virtual machine clustering, load balancing, and SLA
   maintenance
 * Simplify management of large numbers of machines
 * Work across platforms and architectures
 
 oVirt uses:
 
 * A kerberos/LDAP server for authentication and authorization
   (oVirt ships with FreeIPA)
 * DNS/DHCP services on the local LAN -- or provides them for oVirt
   hosts over a private network if desired
 * Libvirt for virtual machine management, storage management, and
   secure remote communication
 * collectd for stats gathering and monitoring
 * Rails for rapid, flexible development
 
 oVirt mailing list: http://www.redhat.com/mailman/listinfo/ovirt-devel
 oVirt IRC: irc.freenode.net/#ovirt
 oVirt website: http://ovirt.org
 
 We encourage anyone interested to download the source (git clone
 git://git.et.redhat.com/ovirt) or the prebuilt appliance
 (http://ovirt.org/download). Let us know what you think!
 
 Enjoy,
 --Hugh Brock
 [EMAIL PROTECTED]
 
 ___
 et-mgmt-tools mailing list
 [EMAIL PROTECTED]
 https://www.redhat.com/mailman/listinfo/et-mgmt-tools
 
- End forwarded message -

-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-   Perl modules: http://search.cpan.org/~danberr/  -=|
|=-   Projects: http://freshmeat.net/~danielpb/   -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 



Re: [kvm-devel] [PATCH] enable gfxboot on VMX

2008-02-15 Thread Steffen Winterfeldt
On Fri, 15 Feb 2008, Alexander Graf wrote:

> On Feb 15, 2008, at 3:56 PM, Anthony Liguori wrote:
>
> > Have you tried SLES-9 or openSUSE variants of the same age?  The ss issue in
> > gfxboot is only something recently introduced.  Prior to that, gfxboot used
> > big real mode so your patch wouldn't be sufficient for those versions of
> > gfxboot.
>
> SLES7 - SLES-9  and SUSE 9.1 through to openSUSE 10.1 do not need the patch.
> They work 'as is'. SLES10 starts in text mode.
> Starting with 10.2 the mov ss issue came along, but maybe Steffen can tell us
> more about the history of this issue.

The use of memory > 1MB was optional in older versions, so they might work
even if the pm switch doesn't work. sles10 has a special check so it doesn't
run in xen; maybe that gets in the way here, too. After sles10 big segments
in real mode are no longer used.


Steffen



[kvm-devel] fastcall removal

2008-02-15 Thread Andrea Arcangeli
This allows compiling the external module against linux.git (fastcall
has finally become the default and only choice).

Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]

diff --git a/kernel/external-module-compat.h b/kernel/external-module-compat.h
index 5611c12..52b745c 100644
--- a/kernel/external-module-compat.h
+++ b/kernel/external-module-compat.h
@@ -686,3 +686,9 @@ static inline struct page *__kvm_vm_fault(struct 
vm_area_struct *vma,
 
 #endif
 
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,25)
+#ifndef FASTCALL
+#define FASTCALL(x)	x
+#define fastcall
+#endif
+#endif
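
For readers unfamiliar with the annotation, here is a small standalone sketch of what the shim achieves: code written with the old `FASTCALL()` and `fastcall` spellings still compiles once both expand to nothing. The function `add_pages` is a made-up example, not something from the KVM sources.

```c
/* Compat shim in the spirit of the patch: both spellings expand to nothing. */
#ifndef FASTCALL
#define FASTCALL(x) x
#define fastcall
#endif

/* Old-style prototype using the wrapper macro (hypothetical function). */
int FASTCALL(add_pages(int a, int b));

/* Old-style definition using the keyword-like annotation. */
fastcall int add_pages(int a, int b)
{
    return a + b;
}
```

The `#ifndef` guard keeps the shim harmless on older kernels that still define the macros themselves.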



Re: [kvm-devel] [ofa-general] Re: Demand paging for memory regions

2008-02-15 Thread Caitlin Bestler


> -----Original Message-----
> From: Christoph Lameter [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 15, 2008 10:46 AM
> To: Caitlin Bestler
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
> [EMAIL PROTECTED]; kvm-devel@lists.sourceforge.net
> Subject: RE: [ofa-general] Re: Demand paging for memory regions
>
> On Fri, 15 Feb 2008, Caitlin Bestler wrote:
>
> > > What does it mean that the application layer has to determine what
> > > pages are registered? The application does not know which of its pages
> > > are currently in memory. It can only force these pages to stay in
> > > memory if they are mlocked.
> >
> > An application that advertises an RDMA accessible buffer
> > to a remote peer *does* have to know that its pages *are*
> > currently in memory.
>
> Ok that would mean it needs to inform the VM of that issue by mlocking
> these pages.
>
> > But the more fundamental issue is recognizing that applications
> > that use direct interfaces need to know that buffers that they
> > enable truly have committed resources. They need a way to
> > ask for twenty *real* pages, not twenty pages of address
> > space. And they need to do it in a way that allows memory
> > to be rearranged or even migrated with them to a new host.
>
> mlock will force the pages to stay in memory without requiring the OS
> to keep them where they are.

So that would mean that mlock is used by the application before it 
registers memory for direct access, and then it is up to the RDMA
layer and the OS to negotiate actual pinning of the addresses for
whatever duration is required.

There is no *protocol* barrier to replacing pages within a Memory
Region as long as it is done in a way that keeps the content of
those pages coherent. But existing devices have their own ideas
on how this is done and existing devices are notoriously poor at
learning new tricks.

Merely mlocking pages deals with the end-to-end RDMA semantics.
What still needs to be addressed is how a fastpath interface
would dynamically pin and unpin. Yielding pins for short-term
suspensions (and flushing cached translations) deals with the
rest. Understanding the range of support that existing devices
could provide with software updates would be the next step if
you wanted to pursue this.
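
As a concrete illustration of "mlock before registering", the sketch below allocates a page-aligned buffer and asks the kernel to keep it resident. It is an assumption-laden example: `struct locked_buf` and its helpers are invented names, the RDMA registration step itself is not shown, and `mlock()` may fail under a low `RLIMIT_MEMLOCK`, which the code records rather than treats as fatal.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical descriptor for a buffer that will later be advertised. */
struct locked_buf {
    void *addr;
    size_t len;
    int locked;     /* 1 if mlock() succeeded, 0 otherwise */
};

/* Allocate `pages` pages and try to lock them. Returns 0 on success. */
static int locked_buf_alloc(struct locked_buf *b, size_t pages)
{
    long psz = sysconf(_SC_PAGESIZE);

    if (psz <= 0 || pages == 0)
        return -1;
    b->len = pages * (size_t)psz;
    if (posix_memalign(&b->addr, (size_t)psz, b->len) != 0)
        return -1;
    memset(b->addr, 0, b->len);                 /* touch every page */
    b->locked = (mlock(b->addr, b->len) == 0);  /* resident, but still movable */
    return 0;
}

/* Unlock (if needed) and release the buffer. */
static void locked_buf_free(struct locked_buf *b)
{
    if (b->locked)
        munlock(b->addr, b->len);
    free(b->addr);
    b->addr = NULL;
    b->len = 0;
    b->locked = 0;
}
```

Note that `mlock()` only guarantees residency; as the thread says, the kernel may still move the physical pages, so pinning for device access remains a separate negotiation.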



[kvm-devel] Query on migration

2008-02-15 Thread Balaji Rao
Hello all, 

When going through the oprofile code, I noticed that boot_cpu_data is used to
determine the CPU vendor, family, etc. Will this be updated on migration to a
different machine, say from Intel to AMD? If not, wouldn't it cause problems?

Please clarify.
-- 
regards,
balaji rao



Re: [kvm-devel] [ofa-general] Re: Demand paging for memory regions

2008-02-15 Thread Christoph Lameter
On Fri, 15 Feb 2008, Caitlin Bestler wrote:

> So that would mean that mlock is used by the application before it
> registers memory for direct access, and then it is up to the RDMA
> layer and the OS to negotiate actual pinning of the addresses for
> whatever duration is required.

Right.

> There is no *protocol* barrier to replacing pages within a Memory
> Region as long as it is done in a way that keeps the content of
> those pages coherent. But existing devices have their own ideas
> on how this is done and existing devices are notoriously poor at
> learning new tricks.

H.. Okay. But that is mainly a device driver maintenance issue.

> Merely mlocking pages deals with the end-to-end RDMA semantics.
> What still needs to be addressed is how a fastpath interface
> would dynamically pin and unpin. Yielding pins for short-term
> suspensions (and flushing cached translations) deals with the
> rest. Understanding the range of support that existing devices
> could provide with software updates would be the next step if
> you wanted to pursue this.

That is addressed on the VM level by the mmu_notifier which started this 
whole thread. The RDMA layers need to subscribe to this notifier and then 
do whatever the hardware requires to unpin and pin memory. I can only go 
as far as dealing with the VM layer. If you have any issues there I'd be 
glad to help.
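
The subscribe-and-react pattern described here can be sketched as a plain callback list. This is a userspace toy, not the kernel's mmu_notifier interface: `struct range_notifier`, `notifier_register`, `notify_invalidate`, and `struct toy_device` are all hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Hypothetical notifier: one callback, linked into a global list. */
struct range_notifier {
    void (*invalidate_range)(struct range_notifier *n,
                             unsigned long start, unsigned long end);
    struct range_notifier *next;
};

static struct range_notifier *notifier_list;

/* A device that can react subscribes; one that cannot simply never calls this. */
static void notifier_register(struct range_notifier *n)
{
    n->next = notifier_list;
    notifier_list = n;
}

/* Called by the "VM layer" before it moves pages in [start, end). */
static int notify_invalidate(unsigned long start, unsigned long end)
{
    struct range_notifier *n;
    int called = 0;

    for (n = notifier_list; n != NULL; n = n->next) {
        n->invalidate_range(n, start, end);
        called++;
    }
    return called;
}

/* Hypothetical device state with one pinned address range. */
struct toy_device {
    struct range_notifier notifier;
    unsigned long pinned_start, pinned_end;
    int pinned;
};

/* Drop the pin if the invalidated range overlaps the pinned one. */
static void toy_invalidate(struct range_notifier *n,
                           unsigned long start, unsigned long end)
{
    struct toy_device *dev =
        (struct toy_device *)((char *)n - offsetof(struct toy_device, notifier));

    if (dev->pinned && start < dev->pinned_end && end > dev->pinned_start)
        dev->pinned = 0;   /* real hardware would also flush translations */
}
```

Re-pinning after the move, and whatever the hardware needs to suspend in-flight transfers, is the device-specific part the thread is debating.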



[kvm-devel] [PATCH 2/2] [PATCH] kvmclock implementation, the guest part.

2008-02-15 Thread Glauber de Oliveira Costa
This is the guest part of the kvm clock implementation.
It does not do tsc-only timing, as the tsc can have deltas
between cpus, and it did not seem worthwhile to me to keep
adjusting them.

We do use it, however, for fine-grained adjustment.

Other than that, time comes from the host.

Signed-off-by: Glauber de Oliveira Costa [EMAIL PROTECTED]
---
 arch/x86/Kconfig   |   10 +++
 arch/x86/kernel/Makefile   |1 
 arch/x86/kernel/kvmclock.c |  161 
 arch/x86/kernel/setup_32.c |5 +
 arch/x86/kernel/setup_64.c |5 +
 5 files changed, 182 insertions(+), 0 deletions(-)
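
The fine-grained adjustment mentioned in the description amounts to scaling a TSC delta into nanoseconds with a fixed-point multiplier and adding it to the host-supplied time. The standalone sketch below shows that arithmetic only. The helper names are invented, and `khz_to_mult` follows the rounding I understand the kernel's `clocksource_khz2mult()` to use, which should be treated as an assumption.

```c
#include <stdint.h>

#define KVM_SCALE 22   /* same shift value the patch uses */

/* Fixed-point nanoseconds-per-cycle: (10^6 << shift) / khz, rounded. */
static uint32_t khz_to_mult(uint32_t khz, unsigned shift)
{
    uint64_t tmp = (uint64_t)1000000 << shift;

    tmp += khz / 2;               /* round to nearest */
    return (uint32_t)(tmp / khz);
}

/* Convert elapsed TSC cycles to nanoseconds. */
static uint64_t tsc_delta_to_ns(uint64_t tsc_now, uint64_t tsc_then,
                                uint32_t mult)
{
    return ((tsc_now - tsc_then) * mult) >> KVM_SCALE;
}

/* Host-provided system time plus the locally measured TSC delta. */
static uint64_t guest_time_ns(uint64_t system_time, uint64_t tsc_timestamp,
                              uint64_t tsc_now, uint32_t mult)
{
    return system_time + tsc_delta_to_ns(tsc_now, tsc_timestamp, mult);
}
```

With a 1 GHz TSC the multiplier is exactly 1 << 22, so one cycle maps to one nanosecond. The 64-bit product can overflow for very large deltas, which is one reason the host needs to refresh the timestamp periodically.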

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3be2305..cc2bc37 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -372,6 +372,16 @@ config VMI
  at the moment), by linking the kernel to a GPL-ed ROM module
  provided by the hypervisor.
 
+config KVM_CLOCK
+   bool "KVM paravirtualized clock"
+   select PARAVIRT
+   help
+ Turning on this option will allow you to run a paravirtualized clock
+ when running over the KVM hypervisor. Instead of relying on a PIT
+ (or probably other) emulation by the underlying device model, the host
+ provides the guest with timing infrastructure such as time of day, and
+ system time
+
 source "arch/x86/lguest/Kconfig"
 
 config PARAVIRT
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 76ec0f8..5b91b82 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -69,6 +69,7 @@ obj-$(CONFIG_DEBUG_RODATA_TEST)   += test_
 obj-$(CONFIG_DEBUG_NX_TEST)+= test_nx.o
 
 obj-$(CONFIG_VMI)  += vmi_32.o vmiclock_32.o
+obj-$(CONFIG_KVM_CLOCK)+= kvmclock.o
 obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch_$(BITS).o
 
 ifdef CONFIG_INPUT_PCSPKR
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
new file mode 100644
index 000..b8da3bf
--- /dev/null
+++ b/arch/x86/kernel/kvmclock.c
@@ -0,0 +1,161 @@
+/*  KVM paravirtual clock driver. A clocksource implementation
+Copyright (C) 2008 Glauber de Oliveira Costa, Red Hat Inc.
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+*/
+
+#include <linux/clocksource.h>
+#include <linux/kvm_para.h>
+#include <asm/arch_hooks.h>
+#include <asm/msr.h>
+#include <xen/interface/xen.h>
+
+#define KVM_SCALE 22
+
+static int kvmclock = 1;
+
+static int parse_no_kvmclock(char *arg)
+{
+   kvmclock = 0;
+   return 0;
+}
+early_param("no-kvmclock", parse_no_kvmclock);
+
+struct shared_info shared_info __attribute__((__aligned__(PAGE_SIZE)));
+
+/* The hypervisor will put information about time periodically here */
+static struct kvm_vcpu_time_info hv_clock[NR_CPUS];
+#define get_clock(cpu, field) hv_clock[cpu].field
+
+static inline u64 kvm_get_delta(u64 last_tsc)
+{
+   int cpu = smp_processor_id();
+   u64 delta = native_read_tsc() - last_tsc;
+   return (delta * get_clock(cpu, tsc_to_system_mul)) >> KVM_SCALE;
+}
+
+static struct kvm_wall_clock wall_clock;
+static cycle_t kvm_clock_read(void);
+/*
+ * The wallclock is the time of day when we booted. Since then, some time may
+ * have elapsed since the hypervisor wrote the data. So we try to account for
+ * that with system time
+ */
+unsigned long kvm_get_wallclock(void)
+{
+   u32 wc_sec, wc_nsec;
+   u64 delta;
+   struct timespec ts;
+   int version, nsec;
+   int low, high;
+
+   low = (int)__pa(&wall_clock);
+   high = ((u64)__pa(&wall_clock) >> 32);
+
+   delta = kvm_clock_read();
+
+   native_write_msr(MSR_KVM_WALL_CLOCK, low, high);
+   do {
+       version = wall_clock.wc_version;
+       rmb();
+       wc_sec = wall_clock.wc_sec;
+       wc_nsec = wall_clock.wc_nsec;
+       rmb();
+   } while ((wall_clock.wc_version != version) || (version & 1));
+
+   delta = kvm_clock_read() - delta;
+   delta += wc_nsec;
+   nsec = do_div(delta, NSEC_PER_SEC);
+   set_normalized_timespec(&ts, wc_sec + delta, nsec);
+   /*
+* Of all mechanisms of time adjustment I've tested, this one
+* was the champion!
+*/
+   return ts.tv_sec + 1;
+}
+
+int kvm_set_wallclock(unsigned long now)
+{
+   return 0;
+}
+
+/*
+ * This is our 

Re: [kvm-devel] Query on migration

2008-02-15 Thread Dor Laor

On Fri, 2008-02-15 at 23:42 +0530, Balaji Rao wrote:
> Hello all,
>
> When going through oprofile code, I noticed that boot_cpu_data was used to
> determine CPU vendor, family etc. Will this be updated on migration to a
> different Machine say, from Intel to AMD ? If not, wouldn't it cause problems
> ?
>
> Please clarify.

Right now it is a problem: the CPU vendor will be different on the
destination. Guest applications might notice it and break, and, even more
likely, instructions such as sysenter/sysexit will not behave the same on
the target CPU.
Luckily there is a patch by Amit Shah for emulating them; it, along with the
option of virtualizing the CPU parameters, will be committed soon.




Re: [kvm-devel] [ofa-general] Re: Demand paging for memory regions

2008-02-15 Thread Caitlin Bestler
Christoph Lameter asked:

> What does it mean that the application layer has to determine what
> pages are registered? The application does not know which of its pages
> are currently in memory. It can only force these pages to stay in
> memory if they are mlocked.
 

An application that advertises an RDMA accessible buffer
to a remote peer *does* have to know that its pages *are*
currently in memory.

The application does *not* need for the virtual-to-physical
mapping of those pages to be frozen for the lifespan of the
Memory Region. But it is issuing an invitation to its peer
to perform direct writes to the advertised buffer. When the
peer decides to exercise that invitation the pages have to
be there.

An analogy: when you write a check for $100 you do not have
to identify the serial numbers of ten $10 bills, but you are
expected to have the funds in your account.

Issuing a buffer advertisement for memory you do not have
is the network equivalent of writing a check that you do
not have funds for.

Now, just as your bank may offer overdraft protection, an
RDMA device could merely report a page fault rather than
tearing down the connection itself. But that does not grant
permission for applications to advertise buffer space that
they do not have committed, it  merely helps recovery from
a programming fault.

A suspend/resume interface between the Virtual Memory Manager
and the RDMA layer allows pages to be re-arranged at the 
convenience of the Virtual Memory Manager without breaking
the application layer peer-to-peer contract. The current
interfaces that pin exact pages are really the equivalent
of having to tell the bank that when Joe cashes this $100
check that you should give him *these* ten $10 bills. It
works, but it adds too much overhead and is very inflexible.
So there are a lot of good reasons to evolve this interface
to better deal with these issues. Other areas of possible
evolution include allowing growing or trimming of Memory
Regions without invalidating their advertised handles.

But the more fundamental issue is recognizing that applications
that use direct interfaces need to know that buffers that they
enable truly have committed resources. They need a way to
ask for twenty *real* pages, not twenty pages of address
space. And they need to do it in a way that allows memory
to be rearranged or even migrated with them to a new host.




Re: [kvm-devel] [ofa-general] Re: Demand paging for memory regions

2008-02-15 Thread Caitlin Bestler
Christoph Lameter wrote:

> > Merely mlocking pages deals with the end-to-end RDMA semantics.
> > What still needs to be addressed is how a fastpath interface
> > would dynamically pin and unpin. Yielding pins for short-term
> > suspensions (and flushing cached translations) deals with the
> > rest. Understanding the range of support that existing devices
> > could provide with software updates would be the next step if
> > you wanted to pursue this.
>
> That is addressed on the VM level by the mmu_notifier which started
> this whole thread. The RDMA layers need to subscribe to this notifier
> and then do whatever the hardware requires to unpin and pin memory.
> I can only go as far as dealing with the VM layer. If you have any
> issues there I'd be glad to help.

There isn't much point in the RDMA layer subscribing to mmu notifications
if the specific RDMA device will not be able to react appropriately when
the notification occurs. I don't see how you get around needing to know
which devices are capable of supporting page migration (via suspend/resume
or other mechanisms) and which can only respond to a page migration by
aborting connections.




[kvm-devel] [PATCH 1/2] [PATCH] kvmclock - the host part.

2008-02-15 Thread Glauber de Oliveira Costa
This is the host part of the kvm clocksource implementation. As it does
not include clockevents, it is a fairly simple implementation. We
only have to register a per-vcpu area, and start writing to it periodically.

The area is binary compatible with xen, as we use the same shadow_info 
structure.

Signed-off-by: Glauber de Oliveira Costa [EMAIL PROTECTED]
---
 arch/x86/kvm/x86.c |   96 
 include/asm-x86/kvm_host.h |7 +++
 include/asm-x86/kvm_para.h |   25 +++
 include/linux/kvm.h|1 
 4 files changed, 128 insertions(+), 1 deletions(-)
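
The wall-clock update in this patch brackets its write with two version increments so that a reader can detect a torn read (odd version, or a version that changed underneath it). The standalone sketch below shows that writer/reader protocol using C11 atomics. It is an illustration of the general technique, not the kernel code: `struct time_area`, `publish_time`, and `read_time` are invented names, and real guest/host sharing relies on kernel barriers rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared area; version is odd while an update is in flight. */
struct time_area {
    _Atomic uint32_t version;
    _Atomic uint32_t sec;
    _Atomic uint32_t nsec;
};

/* Writer: bump to odd, store the payload, bump to even. */
static void publish_time(struct time_area *a, uint32_t sec, uint32_t nsec)
{
    uint32_t v = atomic_load_explicit(&a->version, memory_order_relaxed);

    atomic_store_explicit(&a->version, v + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&a->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&a->nsec, nsec, memory_order_relaxed);
    atomic_store_explicit(&a->version, v + 2, memory_order_release);
}

/* Reader: retry until the version is even and unchanged across the read. */
static void read_time(struct time_area *a, uint32_t *sec, uint32_t *nsec)
{
    uint32_t before, after;

    do {
        before = atomic_load_explicit(&a->version, memory_order_acquire);
        *sec = atomic_load_explicit(&a->sec, memory_order_relaxed);
        *nsec = atomic_load_explicit(&a->nsec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&a->version, memory_order_relaxed);
    } while ((before & 1) || before != after);
}
```

The reader never takes a lock, so the writer is never delayed by a slow or preempted reader; the cost is that the reader may have to loop.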

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0c910c7..5dfc21f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -19,6 +19,7 @@ #include "segment_descriptor.h"
 #include "irq.h"
 #include "mmu.h"
 
+#include <linux/clocksource.h>
 #include <linux/kvm.h>
 #include <linux/fs.h>
 #include <linux/vmalloc.h>
@@ -424,7 +425,7 @@ static u32 msrs_to_save[] = {
 #ifdef CONFIG_X86_64
MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
 #endif
-   MSR_IA32_TIME_STAMP_COUNTER,
+   MSR_IA32_TIME_STAMP_COUNTER, MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 };
 
 static unsigned num_msrs_to_save;
@@ -482,6 +483,70 @@ static int do_set_msr(struct kvm_vcpu *v
return kvm_set_msr(vcpu, index, *data);
 }
 
+static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
+{
+   static int version;
+   struct kvm_wall_clock wc;
+   struct timespec wc_ts;
+
+   if (!wall_clock)
+       return;
+
+   mutex_lock(&kvm->lock);
+
+   version++;
+   kvm_write_guest(kvm, wall_clock, &version, sizeof(version));
+
+   wc_ts = current_kernel_time();
+   wc.wc_sec = wc_ts.tv_sec;
+   wc.wc_nsec = wc_ts.tv_nsec;
+   wc.wc_version = version;
+   kvm_write_guest(kvm, wall_clock, &wc, sizeof(wc));
+
+   version++;
+   kvm_write_guest(kvm, wall_clock, &version, sizeof(version));
+
+   mutex_unlock(&kvm->lock);
+}
+
+static void kvm_write_guest_time(struct kvm_vcpu *v)
+{
+   struct timespec ts;
+   unsigned long flags;
+   struct kvm_vcpu_arch *vcpu = &v->arch;
+   void *shared_kaddr;
+
+   if ((!vcpu->time_page))
+       return;
+
+   /* Keep irq disabled to prevent changes to the clock */
+   local_irq_save(flags);
+   kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER,
+               &vcpu->hv_clock.tsc_timestamp);
+   ktime_get_ts(&ts);
+   local_irq_restore(flags);
+
+   /* With all the info we got, fill in the values */
+
+   vcpu->hv_clock.system_time = ts.tv_nsec +
+                                (NSEC_PER_SEC * (u64)ts.tv_sec);
+   /*
+    * The interface expects us to write an even number signaling that the
+    * update is finished. Since the guest won't see the intermediate
+    * state, we just write 2 at the end
+    */
+   vcpu->hv_clock.version = 2;
+
+   shared_kaddr = kmap_atomic(vcpu->time_page, KM_USER0);
+
+   memcpy(shared_kaddr + vcpu->time_offset, &vcpu->hv_clock,
+          sizeof(vcpu->hv_clock));
+
+   kunmap_atomic(shared_kaddr, KM_USER0);
+
+   mark_page_dirty(v->kvm, vcpu->time >> PAGE_SHIFT);
+}
+
+
 
 int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 {
@@ -511,6 +576,27 @@ int kvm_set_msr_common(struct kvm_vcpu *
    case MSR_IA32_MISC_ENABLE:
        vcpu->arch.ia32_misc_enable_msr = data;
        break;
+   case MSR_KVM_WALL_CLOCK:
+       vcpu->kvm->arch.wall_clock = data;
+       kvm_write_wall_clock(vcpu->kvm, data);
+       break;
+   case MSR_KVM_SYSTEM_TIME: {
+       vcpu->arch.time = data & PAGE_MASK;
+       vcpu->arch.time_offset = data & ~PAGE_MASK;
+
+       vcpu->arch.hv_clock.tsc_to_system_mul =
+           clocksource_khz2mult(tsc_khz, 22);
+       vcpu->arch.hv_clock.tsc_shift = 22;
+
+       down_read(&current->mm->mmap_sem);
+       vcpu->arch.time_page =
+           gfn_to_page(vcpu->kvm, data >> PAGE_SHIFT);
+       up_read(&current->mm->mmap_sem);
+
+       if (is_error_page(vcpu->arch.time_page))
+           vcpu->arch.time_page = NULL;
+       break;
+   }
    default:
        pr_unimpl(vcpu, "unhandled wrmsr: 0x%x data %llx\n", msr, data);
        return 1;
@@ -569,6 +655,12 @@ int kvm_get_msr_common(struct kvm_vcpu *
    case MSR_EFER:
        data = vcpu->arch.shadow_efer;
        break;
+   case MSR_KVM_WALL_CLOCK:
+       data = vcpu->kvm->arch.wall_clock;
+       break;
+   case MSR_KVM_SYSTEM_TIME:
+       data = vcpu->arch.time;
+       break;
    default:
        pr_unimpl(vcpu, "unhandled rdmsr: 0x%x\n", msr);
        return 1;
@@ -696,6 +788,7 @@ int kvm_dev_ioctl_check_extension(long e
case 

Re: [kvm-devel] [ofa-general] Re: Demand paging for memory regions

2008-02-15 Thread Christoph Lameter
On Fri, 15 Feb 2008, Caitlin Bestler wrote:

> There isn't much point in the RDMA layer subscribing to mmu notifications
> if the specific RDMA device will not be able to react appropriately when
> the notification occurs. I don't see how you get around needing to know
> which devices are capable of supporting page migration (via suspend/resume
> or other mechanisms) and which can only respond to a page migration by
> aborting connections.

You either register callbacks if the device can react properly or you
don't. If you don't, then the device will continue to have the problem with
page pinning etc. until someone comes around and implements the
mmu callbacks to fix these issues.

I have doubts regarding the claim that some devices just cannot be made to
suspend and resume appropriately. They obviously can be shut down, and so
it's a matter of sequencing things the right way, i.e. stop the app,
wait for a quiet period, then release resources, etc.






Re: [kvm-devel] [ofa-general] Re: Demand paging for memory regions

2008-02-15 Thread Christoph Lameter
On Fri, 15 Feb 2008, Caitlin Bestler wrote:

> > What does it mean that the application layer has to determine what
> > pages are registered? The application does not know which of its pages
> > are currently in memory. It can only force these pages to stay in
> > memory if they are mlocked.
>
> An application that advertises an RDMA accessible buffer
> to a remote peer *does* have to know that its pages *are*
> currently in memory.

Ok that would mean it needs to inform the VM of that issue by mlocking 
these pages.
 
> But the more fundamental issue is recognizing that applications
> that use direct interfaces need to know that buffers that they
> enable truly have committed resources. They need a way to
> ask for twenty *real* pages, not twenty pages of address
> space. And they need to do it in a way that allows memory
> to be rearranged or even migrated with them to a new host.

mlock will force the pages to stay in memory without requiring the OS to 
keep them where they are.
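
As a rough userspace illustration of the point above, the sketch below allocates a few page-aligned pages and locks them with mlock(). The helper names alloc_locked_pages() and free_locked_pages() are hypothetical and not part of any RDMA or kernel interface. The lock keeps the pages resident; it does not promise that they stay at the same physical address, which is exactly the distinction being discussed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Allocate npages page-aligned pages and mlock() them.
 * Returns 0 on success (buffer stored in *out) or an errno value.
 * Hypothetical helper for illustration only.
 */
static int alloc_locked_pages(size_t npages, void **out)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t len;
	void *buf = NULL;
	int err;

	assert(out != NULL);
	if (page_size <= 0 || npages == 0)
		return EINVAL;
	len = npages * (size_t)page_size;

	err = posix_memalign(&buf, (size_t)page_size, len);
	if (err)
		return err;

	/* Make the range resident and keep it out of swap. The kernel
	 * may still move the pages in physical memory. */
	if (mlock(buf, len) != 0) {
		err = errno;	/* e.g. ENOMEM when RLIMIT_MEMLOCK is exceeded */
		free(buf);
		return err;
	}

	memset(buf, 0, len);
	*out = buf;
	return 0;
}

/* Undo alloc_locked_pages(): unlock the range, then free it. */
static void free_locked_pages(void *buf, size_t npages)
{
	size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);

	munlock(buf, len);
	free(buf);
}
```

A caller would check the return value, because mlock() commonly fails when the process's locked-memory limit is small.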




[kvm-devel] [PATCH 0/2 -v(many)] kvmclock

2008-02-15 Thread Glauber de Oliveira Costa
I think this version addresses Avi's last comments.
I'm not resending the userspace part since it is unchanged.






Re: [kvm-devel] [ofa-general] Re: Demand paging for memory regions

2008-02-15 Thread Caitlin Bestler


> -----Original Message-----
> From: Christoph Lameter [mailto:[EMAIL PROTECTED]]
> Sent: Friday, February 15, 2008 2:50 PM
> To: Caitlin Bestler
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
> [EMAIL PROTECTED]; kvm-devel@lists.sourceforge.net
> Subject: RE: [ofa-general] Re: Demand paging for memory regions
>
> On Fri, 15 Feb 2008, Caitlin Bestler wrote:
>
> > There isn't much point in the RDMA layer subscribing to mmu
> > notifications if the specific RDMA device will not be able to react
> > appropriately when the notification occurs. I don't see how you get
> > around needing to know which devices are capable of supporting page
> > migration (via suspend/resume or other mechanisms) and which can only
> > respond to a page migration by aborting connections.
>
> You either register callbacks if the device can react properly or you
> don't. If you don't, then the device will continue to have the problem
> with page pinning etc. until someone comes around and implements the
> mmu callbacks to fix these issues.
>
> I have doubts regarding the claim that some devices just cannot be made
> to suspend and resume appropriately. They obviously can be shut down, so
> it's a matter of sequencing things the right way, i.e. stop the app,
> wait for a quiet period, then release resources etc.
>

That is true. What some devices will be unable to do is suspend
and resume in a manner that is transparent to the application.
However, for the duration required to re-arrange pages it is 
definitely feasible to do so transparently to the application.

Presumably the Virtual Memory Manager would be more willing to
take an action that is transparent to the user than one that is
disruptive, although obviously as the owner of the physical memory
it has the right to do either.



Re: [kvm-devel] [kvm-ppc-devel] upstream PowerPC qemu breakage

2008-02-15 Thread Hollis Blanchard
On Wed, 2008-02-13 at 08:58 +0200, Avi Kivity wrote:
> It'll need to be built against your kernel tree; please provide a URL.

curl http://penguinppc.org/~hollisb/kvm/kvm-powerpc.mbox | git-am

-- 
Hollis Blanchard
IBM Linux Technology Center




Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> The invalidation of address ranges in a mm_struct needs to be
> performed when pages are removed or permissions etc change.

hm.  Do they?  Why?  If I'm in the process of zero-copy writing a hunk of
memory out to hardware then do I care if someone write-protects the ptes?

Spose so, but some fleshing-out of the various scenarios here would clarify
things.

> If invalidate_range_begin() is called with locks held then we
> pass a flag into invalidate_range() to indicate that no sleeping is
> possible. Locks are only held for truncate and huge pages.

This is so bad.

I suppose that in the restricted couple of cases which you're focussed on it
works OK.  But is it generally suitable?  What if IO is in progress?  What
if other cluster nodes need to be talked to?  Does it suit RDMA?

> In two cases we use invalidate_range_begin/end to invalidate
> single pages because the pair allows holding off new references
> (idea by Robin Holt).

Assuming that there is a missing "within the range" in this description, I
assume that all clients will just throw up their hands in horror and will
disallow all references to all parts of the mm.

Of course, to do that they will need to take a sleeping lock to prevent
other threads from establishing new references.  whoops.

> do_wp_page(): We hold off new references while we update the pte.
>
> xip_unmap: We are not taking the PageLock so we cannot
> use the invalidate_page mmu_rmap_notifier. invalidate_range_begin/end
> stands in.

What does "stands in" mean?

> Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
> Signed-off-by: Robin Holt [EMAIL PROTECTED]
> Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
>
> ---
>  mm/filemap_xip.c |    5 +
>  mm/fremap.c      |    3 +++
>  mm/hugetlb.c     |    3 +++
>  mm/memory.c      |   35 +--
>  mm/mmap.c        |    2 ++
>  mm/mprotect.c    |    3 +++
>  mm/mremap.c      |    7 ++-
>  7 files changed, 51 insertions(+), 7 deletions(-)
>
> Index: linux-2.6/mm/fremap.c
> ===
> --- linux-2.6.orig/mm/fremap.c	2008-02-14 18:43:31.0 -0800
> +++ linux-2.6/mm/fremap.c	2008-02-14 18:45:07.0 -0800
> @@ -15,6 +15,7 @@
>  #include <linux/rmap.h>
>  #include <linux/module.h>
>  #include <linux/syscalls.h>
> +#include <linux/mmu_notifier.h>
>
>  #include <asm/mmu_context.h>
>  #include <asm/cacheflush.h>
> @@ -214,7 +215,9 @@ asmlinkage long sys_remap_file_pages(uns
>  		spin_unlock(&mapping->i_mmap_lock);
>  	}
>
> +	mmu_notifier(invalidate_range_begin, mm, start, start + size, 0);
>  	err = populate_range(mm, vma, start, size, pgoff);
> +	mmu_notifier(invalidate_range_end, mm, start, start + size, 0);

To avoid off-by-one confusion the changelogs, documentation and comments
should be very careful to tell the reader whether the range includes the
byte at start+size.  I don't think that was done?
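
To make the off-by-one question concrete, here is a small sketch of one possible convention. It assumes a half-open range [start, start + size), the same convention the kernel uses for vm_start/vm_end. The patch itself does not say which convention the callbacks use, and struct addr_range, make_range() and range_contains() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open address range: start is included, end is NOT included. */
struct addr_range {
	unsigned long start;	/* first byte covered */
	unsigned long end;	/* first byte after the range */
};

/* Build the range covering "size" bytes beginning at "start". */
static struct addr_range make_range(unsigned long start, unsigned long size)
{
	struct addr_range r = { start, start + size };

	assert(r.end >= r.start);	/* reject address wraparound */
	return r;
}

/* True if addr lies inside the range; the byte at r.end is outside. */
static bool range_contains(struct addr_range r, unsigned long addr)
{
	return addr >= r.start && addr < r.end;
}
```

Under this assumption the byte at start + size is not covered, so two adjacent ranges never overlap and an empty range is simply one with start == end.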





Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:00 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> MMU notifiers are used for hardware and software that establishes
> external references to pages managed by the Linux kernel. These are
> page table entries or tlb entries or something else that allows
> hardware (such as DMA engines, scatter gather devices, networking,
> sharing of address spaces across operating system boundaries) and
> software (Virtualization solutions such as KVM, Xen etc) to
> access memory managed by the Linux kernel.
>
> The MMU notifier will notify the device driver that subscribes to such
> a notifier that the VM is going to do something with the memory
> mapped by that device. The device must then drop references for the
> indicated memory area. The references may be reestablished later.
>
> The notification scheme is much better than the current schemes of
> avoiding the danger of the VM removing pages that are externally
> mapped. We currently either mlock pages used for RDMA, XPmem etc
> in memory or increase the refcount to pin the pages. Increasing
> the refcount makes it impossible for the VM to reclaim the page.
>
> Mlock causes problems with reclaim and may lead to OOM if too many
> pages are pinned in memory. It is also incorrect in terms of what POSIX
> specifies for the role mlock should play. Mlock does *not* pin pages in
> memory. Mlock just means do not allow the page to be moved to swap.
>
> Linux can move pages in memory (for example through the page migration
> mechanism). These pages can be moved even if they are mlocked().
> The current approach of page pinning in use by RDMA etc is conceptually
> broken but there are currently no other easy solutions.
>
> The alternative of increasing the page count to pin pages is also not
> that enticing since there will be continual attempts to reclaim
> or migrate these pages.
>
> The solution here allows us to finally fix this issue by requiring
> such devices to subscribe to a notification chain that will allow
> them to work without pinning. The VM gains control of its memory again
> and the memory that has external references can be managed like regular
> memory.
>
> This patch: Core portion
>
 

What is the status of getting infiniband to use this facility?

How important is this feature to KVM?

To xpmem?

Which other potential clients have been identified and how important is it
to those?


> Index: linux-2.6/Documentation/mmu_notifier/README
> ===
> --- /dev/null	1970-01-01 00:00:00.0 +
> +++ linux-2.6/Documentation/mmu_notifier/README	2008-02-14 22:27:19.0 -0800
> @@ -0,0 +1,105 @@
> +Linux MMU Notifiers
> +---
> +
> +MMU notifiers are used for hardware and software that establishes
> +external references to pages managed by the Linux kernel. These are
> +page table entries or tlb entries or something else that allows
> +hardware (such as DMA engines, scatter gather devices, networking,
> +sharing of address spaces across operating system boundaries) and
> +software (Virtualization solutions such as KVM, Xen etc) to
> +access memory managed by the Linux kernel.
> +
> +The MMU notifier will notify the device driver that subscribes to such
> +a notifier that the VM is going to do something with the memory
> +mapped by that device. The device must then drop references for the
> +indicated memory area. The references may be reestablished later.
> +
> +The notification scheme is much better than the current schemes of
> +dealing with the danger of the VM removing pages.
> +We currently mlock pages used for RDMA, XPmem etc in memory or
> +increase the refcount of the pages.
> +
> +Both cause problems with reclaim and may lead to OOM if too many
> +pages are pinned in memory. Mlock is also incorrect in terms of the POSIX
> +specification of the role of mlock. Mlock does *not* pin pages in
> +memory. It just does not allow the page to be moved to swap.
> +The page refcount is used to track current users of a page struct.
> +Artificially inflating the refcount means that the VM cannot track
> +down all references to a page. It will not be able to reclaim or
> +move a page. However, the core code will try again and again because
> +the assumption is that an elevated refcount is a temporary situation.
> +
> +Linux can move pages in memory (for example through the page migration
> +mechanism). These pages can be moved even if they are mlocked().
> +So the current approach in use by RDMA etc etc is conceptually broken
> +but there are currently no other easy solutions.
> +
> +The solution here allows us to finally fix this issue by requiring
> +such devices to subscribe to a notification chain that will allow
> +them to work without pinning.
> +
> +The notifier chains provide two callback mechanisms. The
> +first one is required for any device that establishes external mappings.
> +The second (rmap) mechanism is required if a device needs to be
> +able to sleep when invalidating references. Sleeping may be

Re: [kvm-devel] [patch 3/6] mmu_notifier: invalidate_page callbacks

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:02 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> Two callbacks to remove individual pages as done in rmap code
>
>       invalidate_page()
>
> Called from the inner loop of rmap walks to invalidate pages.
>
>       age_page()
>
> Called for the determination of the page referenced status.
>
> If we do not care about page referenced status then an age_page callback
> may be omitted. PageLock and pte lock are held when either of the
> functions is called.

The age_page mystery shallows.

It would be useful to have some rationale somewhere in the patchset for the
existence of this callback.

>  #include <asm/tlbflush.h>
>
> @@ -287,7 +288,8 @@ static int page_referenced_one(struct pa
>  	if (vma->vm_flags & VM_LOCKED) {
>  		referenced++;
>  		*mapcount = 1;	/* break early from loop */
> -	} else if (ptep_clear_flush_young(vma, address, pte))
> +	} else if (ptep_clear_flush_young(vma, address, pte) |
> +		   mmu_notifier_age_page(mm, address))
>  		referenced++;

The | is obviously deliberate.  But no explanation is provided telling us
why we still call the callback if ptep_clear_flush_young() said the page
was recently referenced.  People who read your code will want to understand
this.
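
The difference between | and || here can be shown with a small standalone sketch. The two functions first_check() and second_check() are hypothetical stand-ins for ptep_clear_flush_young() and mmu_notifier_age_page(); they only count how often they run. One plausible reason for the bitwise form, which the patch does not confirm, is that the second call has a side effect (clearing the external "young" state) that should happen regardless of the first result.

```c
#include <assert.h>

static int first_calls, second_calls;

/* Stand-ins with side effects: each counts its calls and reports
 * whether "its" reference bit was set. */
static int first_check(void)  { first_calls++;  return 1; }
static int second_check(void) { second_calls++; return 0; }

/* Bitwise OR: both operands are always evaluated (order unspecified). */
static int referenced_bitwise(void)
{
	return (first_check() | second_check()) != 0;
}

/* Logical OR: second_check() is skipped once first_check() is nonzero. */
static int referenced_logical(void)
{
	return first_check() || second_check();
}

/* Report how many times second_check() ran under each operator. */
static void count_second_calls(int *with_bitwise, int *with_logical)
{
	assert(with_bitwise && with_logical);

	second_calls = 0;
	referenced_bitwise();
	*with_bitwise = second_calls;

	second_calls = 0;
	referenced_logical();
	*with_logical = second_calls;
}
```

Both variants return the same truth value; they differ only in whether the second function's side effect takes place, which is why a comment at the call site is worth having.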

>  	/* Pretend the page is referenced if the task has the
> @@ -455,6 +457,7 @@ static int page_mkclean_one(struct page
>
>  	flush_cache_page(vma, address, pte_pfn(*pte));
>  	entry = ptep_clear_flush(vma, address, pte);
> +	mmu_notifier(invalidate_page, mm, address);

I just don't see how this can be done if the callee has another thread in
the middle of establishing IO against this region of memory.
->invalidate_page() _has_ to be able to block.  Confused.





Re: [kvm-devel] [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem)

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:04 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> These special additional callbacks are required because XPmem (and likely
> other mechanisms) do use their own rmap (multiple processes on a series
> of remote Linux instances may be accessing the memory of a process).
> F.e. XPmem may have to send out notifications to remote Linux instances
> and receive confirmation before a page can be freed.
>
> So we handle this like an additional Linux reverse map that is walked after
> the existing rmaps have been walked. We leave the walking to the driver that
> is then able to use something else than a spinlock to walk its reverse
> maps. So we can actually call the driver without holding spinlocks while
> we hold the Pagelock.
>
> However, we cannot determine the mm_struct that a page belongs to at
> that point. The mm_struct can only be determined from the rmaps by the
> device driver.
>
> We add another pageflag (PageExternalRmap) that is set if a page has
> been remotely mapped (f.e. by a process from another Linux instance).
> We can then only perform the callbacks for pages that are actually in
> remote use.
>
> Rmap notifiers need an extra page bit and are only available
> on 64 bit platforms. This functionality is not available on 32 bit!
>
> A notifier that uses the reverse maps callbacks does not need to provide
> the invalidate_page() method that is called when locks are held.
>

hrm.

> +#define mmu_rmap_notifier(function, args...) \
> +	do { \
> +		struct mmu_rmap_notifier *__mrn; \
> +		struct hlist_node *__n; \
> + \
> +		rcu_read_lock(); \
> +		hlist_for_each_entry_rcu(__mrn, __n, \
> +				&mmu_rmap_notifier_list, hlist) \
> +			if (__mrn->ops->function) \
> +				__mrn->ops->function(__mrn, args); \
> +		rcu_read_unlock(); \
> +	} while (0);
> +

buggy macro: use locals.
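
One reading of "use locals" is that the macro expands its arguments inside the loop, so an argument with side effects is evaluated once per registered notifier. The userspace sketch below illustrates that reading with hypothetical names (notify_fn, NOTIFY_ALL_BUGGY, NOTIFY_ALL); it is not the kernel macro, and whether this is exactly the problem meant here is an assumption. Separately, a macro body that ends in "while (0);" with its own semicolon breaks when used as the body of an if that has an else branch, so the sketch leaves the semicolon to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; stands in for the ops->function pointers. */
typedef void (*notify_fn)(unsigned long addr);

static unsigned long calls_seen;	/* how many callbacks ran */
static unsigned long last_addr;		/* argument of the latest call */

static void record(unsigned long addr)
{
	calls_seen++;
	last_addr = addr;
}

/* Buggy: "addr" is expanded inside the loop, so it is evaluated once
 * for every non-NULL callback. */
#define NOTIFY_ALL_BUGGY(fns, n, addr) \
	do { \
		size_t i_; \
		for (i_ = 0; i_ < (n); i_++) { \
			if ((fns)[i_]) \
				(fns)[i_](addr); \
		} \
	} while (0)

/* Fixed: each argument is copied into a local exactly once. */
#define NOTIFY_ALL(fns, n, addr) \
	do { \
		notify_fn *fns_ = (fns); \
		size_t n_ = (n), i_; \
		unsigned long addr_ = (addr); \
		for (i_ = 0; i_ < n_; i_++) { \
			if (fns_[i_]) \
				fns_[i_](addr_); \
		} \
	} while (0)

/* Demo: how often is "++next" evaluated by each variant? */
static unsigned long evals_buggy(void)
{
	notify_fn fns[3] = { record, NULL, record };
	unsigned long next = 0;

	NOTIFY_ALL_BUGGY(fns, 3, ++next);
	return next;
}

static unsigned long evals_fixed(void)
{
	notify_fn fns[3] = { record, NULL, record };
	unsigned long next = 0;

	NOTIFY_ALL(fns, 3, ++next);
	assert(last_addr == 1);	/* every callback saw the same value */
	return next;
}
```

With two non-NULL callbacks, the buggy variant increments the counter twice and passes a different address to each callback, while the variant with locals increments it once.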

> +#define mmu_rmap_notifier(function, args...) \
> +	do { \
> +		if (0) { \
> +			struct mmu_rmap_notifier *__mrn; \
> + \
> +			__mrn = (struct mmu_rmap_notifier *)(0x00ff); \
> +			__mrn->ops->function(__mrn, args); \
> +		} \
> +	} while (0);
> +

Same observation as in the other patch.

> ===
> --- linux-2.6.orig/mm/mmu_notifier.c	2008-02-14 21:17:51.0 -0800
> +++ linux-2.6/mm/mmu_notifier.c	2008-02-14 21:21:04.0 -0800
> @@ -74,3 +74,37 @@ void mmu_notifier_unregister(struct mmu_
>  }
>  EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
>
> +#ifdef CONFIG_64BIT
> +static DEFINE_SPINLOCK(mmu_notifier_list_lock);
> +HLIST_HEAD(mmu_rmap_notifier_list);
> +
> +void mmu_rmap_notifier_register(struct mmu_rmap_notifier *mrn)
> +{
> +	spin_lock(&mmu_notifier_list_lock);
> +	hlist_add_head_rcu(&mrn->hlist, &mmu_rmap_notifier_list);
> +	spin_unlock(&mmu_notifier_list_lock);
> +}
> +EXPORT_SYMBOL(mmu_rmap_notifier_register);
> +
> +void mmu_rmap_notifier_unregister(struct mmu_rmap_notifier *mrn)
> +{
> +	spin_lock(&mmu_notifier_list_lock);
> +	hlist_del_rcu(&mrn->hlist);
> +	spin_unlock(&mmu_notifier_list_lock);
> +}
> +EXPORT_SYMBOL(mmu_rmap_notifier_unregister);

> +/*
> + * Export a page.
> + *
> + * Pagelock must be held.
> + * Must be called before a page is put on an external rmap.
> + */
> +void mmu_rmap_export_page(struct page *page)
> +{
> +	BUG_ON(!PageLocked(page));
> +	SetPageExternalRmap(page);
> +}
> +EXPORT_SYMBOL(mmu_rmap_export_page);

The other patch used EXPORT_SYMBOL_GPL.





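[kvm-devel] Excellent quality, serving the world - 2nd China Western (Chongqing) Building, Building Materials and Energy-Saving Technology Exhibition

2008-02-15 Thread 博瑞德国际展览 (Boruide International Exhibition)
kvm-devel, you are welcome to visit:

  2nd China Western (Chongqing) Building, Building Materials and Energy-Saving Technology Exhibition

  March 6-8, 2008   Chenjiaping, Chongqing Exhibition Center

  Exhibition website: http://www.cnfair.org


  Themed exhibition areas:
  | Energy-saving architectural glass and film technology | Energy-saving doors, windows and curtain walls | New wall materials, insulation materials, and brick and tile equipment |
  | Dry-mix mortar technology, products and equipment | Water saving and water resource utilization | Green lighting, smart power-saving products and technology |
  | Renewable energy technology for buildings | Energy-saving hot water technology and products | Heating, air conditioning and ventilation technology and equipment |


  Held concurrently:
  5th China Western Building Decoration and Decorative Materials Exhibition
  2nd China Chongqing Landscape and Architectural Design Exhibition
  4th China Western (Chongqing) Exhibition of Construction Formwork, Scaffolding, Suspended Platforms and Aerial Work Machinery
  2008 2nd Promotion Conference on New Technologies, Materials, Products and Equipment for Green Energy-Saving Buildings


  Technical exchange sessions (for more sessions, see the on-site announcements):

  Presenter: 赫克力士贸易(上海)有限公司
  Topic: Introduction to the presenter's new products
  Time: March 6, 2008, 13:30-14:20

  Presenter: 何显毅(中国)工程建筑师楼有限公司
  Presenter: 美国得信公司 / Symons

  Presenter: 北京环益美高分子聚合物研究所
  Topic: Research and application of eco-friendly multifunctional resin-based redispersible powder
  Time: March 6, 2008, 14:30-16:00


  Organizer:

  重庆博瑞德展览有限公司 (Chongqing Boruide Exhibition Co., Ltd.)
  Tel: 023-86382802  86382803  62925058  86393228
  Fax: 023-86393226  62925059
  Website: http://www.cnfair.org
  [EMAIL PROTECTED]
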