Re: About the kernel configuration (CONFIG_CPU_FREQ and CONFIG_CPU_IDLE)

2022-01-19 Thread sunshilong
Updated:
I checked the frequency of the CPU with `cat /proc/cpuinfo`.
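To watch the value over time instead of taking a single snapshot, a small sampler along these lines can be used (a sketch that assumes the x86 "cpu MHz" field in /proc/cpuinfo):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* print the "cpu MHz" line of every CPU once per second */
int main(void)
{
	char line[256];

	for (;;) {
		FILE *f = fopen("/proc/cpuinfo", "r");

		if (!f)
			return 1;
		while (fgets(line, sizeof(line), f))
			if (strncmp(line, "cpu MHz", 7) == 0)
				fputs(line, stdout);	/* e.g. "cpu MHz : 1916.614" */
		fclose(f);
		puts("----");
		sleep(1);
	}
}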

On Wed, Jan 19, 2022 at 9:23 AM 孙世龙 sunshilong 
wrote:

> Hi, list
> I am using Linux 5.4.78 with PREEMPT_RT patch.
> Hardware name: UNO-2372G-J021AE.
>
> It's strange that the frequency of the CPU is not constant (i.e.
> not fixed at a specific frequency) if I disable both CONFIG_CPU_FREQ
> and CONFIG_CPU_IDLE,
> whereas
> the frequency of the CPU is constant if I only disable CONFIG_CPU_FREQ
> (leaving CONFIG_CPU_IDLE enabled).
>
> Could somebody please shed some light on this matter?
>
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


About the kernel configuration (CONFIG_CPU_FREQ and CONFIG_CPU_IDLE)

2022-01-18 Thread sunshilong
Hi, list
I am using Linux 5.4.78 with PREEMPT_RT patch.
Hardware name: UNO-2372G-J021AE.

It's strange that the frequency of the CPU is not constant (i.e.
not fixed at a specific frequency) if I disable both CONFIG_CPU_FREQ
and CONFIG_CPU_IDLE,
whereas
the frequency of the CPU is constant if I only disable CONFIG_CPU_FREQ
(leaving CONFIG_CPU_IDLE enabled).

Could somebody please shed some light on this matter?
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: How to acquire the occupancy time of CPUs within a certain duration, e.g. 10ms, 1ms, etc.?

2020-09-05 Thread sunshilong
Hi,

Thank you for your reply.

>do you mean make a process exclusively run on a cpu core without being
>interrupted/preempted for certain time?
No.
I want to know how long the CPUs have been busy within a specific duration,
e.g. 1 ms.

Thank you for your attention to this matter.
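For example, sampling /proc/stat twice over the window and diffing the counters gives the CPU busy time (a sketch; note that /proc/stat accounting is in USER_HZ ticks, typically 10 ms, so a 1 ms window is below its resolution and would need perf or tracing instead):

#include <stdio.h>
#include <unistd.h>

struct cpu_sample { unsigned long long busy, total; };

/* read the aggregate "cpu" line of /proc/stat (fields per proc(5)) */
static int read_cpu(struct cpu_sample *s)
{
	unsigned long long user, nice, sys, idle, iowait, irq, softirq, steal;
	FILE *f = fopen("/proc/stat", "r");

	if (!f)
		return -1;
	if (fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
		   &user, &nice, &sys, &idle, &iowait, &irq, &softirq, &steal) != 8) {
		fclose(f);
		return -1;
	}
	fclose(f);
	s->busy  = user + nice + sys + irq + softirq + steal;
	s->total = s->busy + idle + iowait;
	return 0;
}

int main(void)
{
	struct cpu_sample a, b;

	if (read_cpu(&a))
		return 1;
	usleep(10 * 1000);		/* 10 ms measurement window */
	if (read_cpu(&b))
		return 1;
	printf("busy %llu of %llu ticks in the window\n",
	       b.busy - a.busy, b.total - a.total);
	return 0;
}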


On Fri, Sep 4, 2020 at 11:59 AM Mulyadi Santosa 
wrote:

>
>
> On Fri, Aug 28, 2020 at 4:14 PM 孙世龙 sunshilong 
> wrote:
>
>> Hi, list
>>
>> How to acquire the occupancy time of CPUs within a certain duration,
>> e.g. 10ms, 1ms, etc.?
>>
>> Could somebody please shed some light on this matter?
>> I would be grateful to have some help with this question.
>>
>> Best Regards.
>> Sunshilong
>>
>> ___
>> Kernelnewbies mailing list
>> Kernelnewbies@kernelnewbies.org
>> https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>>
>
> Hi
>
> do you mean make a process exclusively run on a cpu core without being
> interrupted/preempted for certain time?
>
> if yes, i think schedule a process as real time (sched fifo?), create
> timer, sched yield after timer runs out. cmiiw
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com
>
>
> ___
> Kernelnewbies mailing list
> Kernelnewbies@kernelnewbies.org
> https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


How to acquire the occupancy time of CPUs within a certain duration, e.g. 10ms, 1ms, etc.?

2020-08-28 Thread sunshilong
Hi, list

How to acquire the occupancy time of CPUs within a certain duration,
e.g. 10ms, 1ms, etc.?

Could somebody please shed some light on this matter?
I would be grateful to have some help with this question.

Best Regards.
Sunshilong

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Why might disabling CONFIG_APIC cause the kernel to fail to boot?

2020-07-19 Thread sunshilong
Hi, list

The same kernel .deb package is installed on two different platforms (both have
Intel x86_64 CPUs, but not the same model).
What confuses me is that it works well on one platform whereas it
causes the kernel to fail to boot on the other one.
Could you please shed some light on this matter?

Thank you for your attention to this matter.
sunshilong

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


How to confirm whether __GI___clock_gettime() is implemented as a fast system call or not on the x86_64 platform with Linux 4.19.84?

2020-07-19 Thread sunshilong
Hi, list

How can I confirm whether __GI___clock_gettime() is implemented as a fast
system call (i.e. via the vDSO) or not on the x86_64 platform with Linux 4.19.84?
Here is a related backtrace (hoping it gives some hint on this question):

(gdb) bt
#0  0x7ffc41da89fa in ?? ()
#1  0x7fa90785aaf0 in ?? ()
#2  0x7fa905a16876 in __GI___clock_gettime (clock_id=0,
tp=0x7fa90785aac0) at ../sysdeps/unix/clock_gettime.c:115

Thank you for your attention to this matter.
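One way I can think of to check this is to compare the cost of the libc call against a forced syscall; if the libc path is far cheaper (and no clock_gettime syscalls show up under strace), it is going through the vDSO fast path. A sketch:

#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <sys/syscall.h>
#include <unistd.h>

static long long ns_of(const struct timespec *t)
{
	return t->tv_sec * 1000000000LL + t->tv_nsec;
}

int main(void)
{
	struct timespec t0, t1, ts;
	const int n = 1000000;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < n; i++)
		clock_gettime(CLOCK_MONOTONIC, &ts);		/* libc / vDSO path */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	printf("libc wrapper: %lld ns/call\n", (ns_of(&t1) - ns_of(&t0)) / n);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < n; i++)
		syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &ts);	/* forced syscall */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	printf("raw syscall : %lld ns/call\n", (ns_of(&t1) - ns_of(&t0)) / n);
	return 0;
}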

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Are there some potentially serious problems that I should be aware of if I totally disable the CONFIG_ACPI option on the X86_64 platform?

2020-07-18 Thread sunshilong
Hi, Pavel
>> Are there some potentially serious problems that I should be aware of
>> if I totally disable the CONFIG_ACPI option on the X86_64 platform?
>These machines are still mostly IBM-PC compatible, so it is likely to
>somehow work. You'll likely get worse power and thermal
>management. Try it.

Thank you for taking the time to respond to my question.

Do you mean that disabling the CONFIG_ACPI option mostly works on IBM-PC
compatible platforms?

Do you think disabling CONFIG_ACPI may cause the OS to fail to boot?

Thank you for your attention to this matter.

On Sat, Jul 4, 2020 at 8:22 PM Pavel Machek  wrote:
>
> Hi!
>
> > Are there some potentially serious problems that I should be aware of
> > if I totally disable the CONFIG_ACPI option on the X86_64 platform?
> >
> > Would it do harm to the hardware?
> >
> > Thank you for your attention to this matter.
>
> These machines are still mostly IBM-PC compatible, so it is likely to
> somehow work. You'll likely get worse power and thermal
> management. Try it.
>
> Pavel
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Are there some potentially serious problems that I should be aware of if I totally disable the CONFIG_ACPI option on the X86_64 platform?

2020-07-18 Thread sunshilong
Hi, Greg KH
>Yes, your ACPI-based system will not boot.
Do you mean the OS will not boot, or that some of the Linux subsystems (e.g.
USB, PCI, etc.) will not come up?

Thank you for your attention to this matter.
On Wed, Jul 1, 2020 at 6:58 PM Greg KH  wrote:
>
> On Wed, Jul 01, 2020 at 05:15:52PM +0800, 孙世龙 sunshilong wrote:
> > Hi, list
> >
> > Are there some potentially serious problems that I should be aware of
> > if I totally disable the CONFIG_ACPI option on the X86_64 platform?
>
> Yes, your ACPI-based system will not boot.
>
> > Would it do harm to the hardware?
>
> It might, try it and see :)
>
> good luck!!!
>
> greg k-h

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: How to comprehend this code snippet: __asm__ __volatile__("rdtsc" : "=A"(t))?

2020-07-16 Thread sunshilong
Hi, Jeff

Thank you for taking the time to respond to my question.
Thanks to your help, I have a deeper understanding of this matter.

>Also read the note in the Machine Constraints for:
>   unsigned long long rdtsc (void)
>{
>  unsigned long long tick;
> __asm__ __volatile__("rdtsc":"=A"(tick));
>  return tick;
>}
>The manual says the pattern is wrong for x86_64.
Thank you for your notification. I checked it and found it's for X86_32 only.
I didn't notice such differences before.
Here is the full related code snippet:

#ifdef CONFIG_X86_32
#define ipipe_read_tsc(t)	\
	__asm__ __volatile__("rdtsc" : "=A"(t))
#else  /* X86_64 */
#define ipipe_read_tsc(t)	do {				\
	unsigned int __a, __d;					\
	asm volatile("rdtsc" : "=a" (__a), "=d" (__d));		\
	(t) = ((unsigned long)__a) | (((unsigned long)__d) << 32); \
} while (0)
#endif

>The 'A' is the constraint for the EDX:EAX register pair.
>Also see https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html
>and https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html.
Thanks to your attached document, I find a lot of useful information.
BTW, what do "The a register", "The b register", etc. mean?
I googled but didn't get any useful information.
Could you please give me a few brief explanations or suggest some
documents for me to go through?
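For what it's worth, here is a minimal user-space equivalent of the X86_64 branch above, with my reading of the constraint names spelled out: in the GCC docs "the a register"/"the d register" just mean the RAX/EAX/AX and RDX/EDX/DX register families (whatever width the operand needs), while the i386-only "A" constraint names the EDX:EAX pair as one double-word operand.

#include <stdint.h>
#include <stdio.h>

/* read the TSC on x86_64: RDTSC puts the low half in EAX and the high
 * half in EDX, so "=a" and "=d" pin the two 32-bit outputs */
static inline uint64_t rdtsc64(void)
{
	uint32_t lo, hi;

	__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	printf("tsc = %llu\n", (unsigned long long)rdtsc64());
	return 0;
}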

Thank you for your attention to this matter.
Looking forward to hearing from you.
Best Wishes.
sunshilong

On Thu, Jul 16, 2020 at 8:52 PM Jeffrey Walton  wrote:
>
> On Thu, Jul 16, 2020 at 8:22 AM 孙世龙 sunshilong  
> wrote:
> >
> > Here is the code snippet:
> > #define ipipe_read_tsc(t)  \
> > __asm__ __volatile__("rdtsc" : "=A"(t))
>
> I hope that is i386 only, and not x86_64.
>
> > I found that the rdtsc (Read Time-Stamp Counter) instruction is used
> > to determine how many CPU ticks took place since the processor was
> > reset.
> >
> > But what does
> > "=A"(t)
> > mean?
>
> The '=A' is a GCC machine constraint for i386 and an output operand.
> The 'A' is the constraint for the EDX:EAX register pair. The '=' means it is
> being written to.
>
> Also see https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html
> and https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html.
>
> Also read the note in the Machine Constraints for:
>
> unsigned long long rdtsc (void)
> {
>   unsigned long long tick;
>   __asm__ __volatile__("rdtsc":"=A"(tick));
>   return tick;
> }
>
> The manual says the pattern is wrong for x86_64.
>
> Jeff

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


How to comprehend this code snippet: __asm__ __volatile__("rdtsc" : "=A"(t))?

2020-07-16 Thread sunshilong
Hi, list

Here is the code snippet:
#define ipipe_read_tsc(t)  \
__asm__ __volatile__("rdtsc" : "=A"(t))

I found that the rdtsc (Read Time-Stamp Counter) instruction is used
to determine how many CPU ticks took place since the processor was
reset.

But what does
"=A"(t)
mean?

Thank you for your attention to this matter.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Why is the triggered count of NMIs equal to the PMI count?

2020-07-15 Thread sunshilong
As per the subject, here are the statistics taken from my PC at two different times:
 $ cat /proc/interrupts | grep -E 'NMI|PMI'
 NMI:649374347424   Non-maskable interrupts
 PMI:649374347424   Performance monitoring interrupts

 $ cat /proc/interrupts | grep -Ei 'PMI|NMI'
 NMI:247274238289   Non-maskable interrupts
 PMI:247274238289   Performance monitoring interrupts

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


What are the major differences between soft lockup and hard lockup which are both detected by watchdog?

2020-07-11 Thread sunshilong
What are the major differences between soft lockup and hard lockup
which are both detected by watchdog?
Here are some example logs:
1. Watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [cantest:1669]
2. NMI watchdog: Watchdog detected hard LOCKUP on CPU 0

Best regards.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


What methods or tools could be used to investigate the causes of hard or soft lockup?

2020-07-11 Thread sunshilong
Hi, list

I have frequently faced soft lockups and hard lockups.
Here are some example logs:
1. Watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [cantest:1669]
2. NMI watchdog: Watchdog detected hard LOCKUP on CPU 0

What methods or tools could be used to investigate the causes of hard
or soft lockups?
I have been stuck with such problems for a long time.
I would be grateful if somebody shed some light on this issue.

Any advice on how to proceed?

Thank you for your attention to this matter.
Best regards.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Why are there "<IRQ>" and "</IRQ>" in the call trace section? What do they imply?

2020-07-09 Thread sunshilong
Hi, list

As the subject, here is the related log:
[72873.713472] Call Trace:
[72873.713473]  <IRQ>
[72873.713474]  switch_mm_irqs_off+0x31b/0x4e0
[72873.713475]  xnarch_switch_to+0x2f/0x80
[72873.713476]  ___xnsched_run.part.74+0x154/0x480
[72873.713476]  ___xnsched_run+0x35/0x50
[72873.713477]  xnintr_irq_handler+0x346/0x4c0
[72873.713478]  ? xnintr_core_clock_handler+0x1b6/0x360
[72873.713479]  dispatch_irq_head+0x8e/0x110
[72873.713479]  ? xnintr_irq_handler+0x5/0x4c0
[72873.713481]  ? dispatch_irq_head+0x8e/0x110
[72873.713482]  __ipipe_dispatch_irq+0xd9/0x1c0
[72873.713483]  __ipipe_handle_irq+0x86/0x1e0
[72873.713483]  common_interrupt+0xf/0x2c
[72873.713484]  </IRQ>

Maybe the latter one (i.e. </IRQ>) implies there was an interrupt
request and the common_interrupt() function handled it.
Am I right?

But what about the former one (i.e. <IRQ>)?

Thank you for your attention to this matter.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


What are the most frequently used methods for system-level tracing?

2020-07-08 Thread sunshilong
Hi, list

What are the most frequently used methods for system-level tracing?
I would appreciate it if you could give me some related documents to go through.

Thank you for your attention to this matter.
Best regards.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: How can I investigate the cause of "watchdog: BUG: soft lockup"?

2020-07-07 Thread sunshilong
Hi,

Thank you for your help and patience.

>Jul  3 10:23:31 yx kernel: [ 1176.166204] CPU: 0 PID: 1837 Comm:
>rt_cansend Tainted: G   OE 4.19.84-solve-alc-failure #1
>Jul  3 10:23:31 yx kernel: [ 1176.166209] I-pipe domain: Linux
>Jul  3 10:23:31 yx kernel: [ 1176.166218] RIP:
>0010:queued_spin_lock_slowpath+0xd9/0x1a0
...
>  Jul  3 10:23:31 yx kernel: [ 1176.166252] Call Trace:
> Jul  3 10:23:31 yx kernel: [ 1176.166261]  _raw_spin_lock+0x20/0x30
> Jul  3 10:23:31 yx kernel: [ 1176.166270]  can_write+0x6c/0x2c0 [advcan]
> Jul  3 10:23:31 yx kernel: [ 1176.166292]  __vfs_write+0x3a/0x190

One more question, what's the relation between "queued_spin_lock_slowpath"
and "_raw_spin_lock"?

Best Regards.

On Sat, Jul 4, 2020 at 5:13 PM 孙世龙 sunshilong  wrote:
>
> Hi, Valdis Klētnieks
>
> Thank you for taking the time to respond to me.
> I have a better understanding of this matter.
>
> >> Can I draw the conclusion that continually acquiring the spinlock causes 
> >> the soft
> >> lockup and the CPU has been stuck for 22s?
> >> Can I think in this way?
>
> >No.  It's been stuck for 22s *TRYING* and *FAILING* to get the spinlock.
>
> I see. So there is a thread that has held the corresponding spinlock
> for more 22s,  and a CPU is sticking(busy acquiring the spinlock) at the
> same duration.
> Can I think in this way?
>
> Thank you for your attention to this matter.
> Best Regards.
>
> On Sat, Jul 4, 2020 at 4:09 PM Valdis Klētnieks  wrote:
>
On Sat, Jul 4, 2020 at 5:04 PM 孙世龙 sunshilong  wrote:
> >
> > Hi, Valdis Klētnieks
> >
> > Thank you for taking the time to respond to me.
> > I have a better understanding of this matter.
> >
> > >> Can I draw the conclusion that continually acquiring the spinlock causes 
> > >> the soft
> > >> lockup and the CPU has been stuck for 22s?
> > >> Can I think in this way?
> >
> > >No.  It's been stuck for 22s *TRYING* and *FAILING* to get the spinlock.
> >
> > I see. So there is a thread that has held the corresponding spinlock
> > for more 22s.
> > Can I think in this way?
> >
> > Thank you for your attention to this matter.
> > Best Regards.
> >
> > On Sat, Jul 4, 2020 at 4:09 PM Valdis Klētnieks  wrote:
> > >
> > >
> > > > Can I draw the conclusion that continually acquiring the spinlock 
> > > > causes the soft
> > > > lockup and the CPU has been stuck for 22s?
> > > > Can I think in this way?
> > >
> > > No.  It's been stuck for 22s *TRYING* and *FAILING* to get the spinlock.
> > >
> > > For comparison - spinlocks are usually used when you need a lock, but the
> > > code protected by the lock is short (things like adding to a linked list, 
> > > etc),
> > > so it should again become available in milliseconds - things where it 
> > > would take
> > > longer to put this thread to sleep and wake another one up than we expect
> > > to be waiting for this lock.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


What are the relations between queued_spin_lock_slowpath function and __raw_spin_lock function?

2020-07-06 Thread sunshilong
Hi, list

What is the relation between the queued_spin_lock_slowpath() function and
the __raw_spin_lock() function?
For your convenience, here are the URLs of the related code snippets:
https://elixir.bootlin.com/linux/v4.19.84/source/include/linux/spinlock_api_smp.h#L139

https://elixir.bootlin.com/linux/v4.19.84/source/kernel/locking/qspinlock.c#L282

Though I have carefully read the related code snippet, I still can't
answer this question. I would be grateful to have some help with this
question.
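For reference, here is the call chain as I read the v4.19 sources (SMP with queued spinlocks; lockdep/debug variants omitted), which is how the two functions relate:

/*
 * Rough call chain in v4.19 (CONFIG_SMP, CONFIG_QUEUED_SPINLOCKS=y):
 *
 *   spin_lock(lock)
 *     raw_spin_lock()
 *       _raw_spin_lock()                       kernel/locking/spinlock.c
 *         __raw_spin_lock()                    include/linux/spinlock_api_smp.h
 *           preempt_disable()
 *           do_raw_spin_lock()
 *             arch_spin_lock()
 *               queued_spin_lock()             include/asm-generic/qspinlock.h
 *                 - fast path: one cmpxchg of the lock word, 0 -> locked
 *                 - if that cmpxchg fails (the lock is already held):
 *                 queued_spin_lock_slowpath()  kernel/locking/qspinlock.c
 *
 * So queued_spin_lock_slowpath() is the contended-case continuation of the
 * acquisition that __raw_spin_lock() started: seeing it at the top of a
 * soft-lockup backtrace means the CPU is stuck spinning/queueing for a lock
 * that some other context still holds.
 */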

Thank you for your attention to this matter.
Best regards.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: There is "softlockup_tick()" function in the source code of linux-2.6.32, but what's the corresponding function name in the linux-4.9 or later?

2020-07-05 Thread sunshilong
Thank you for the clarification.
I have a better understanding of this matter with help.

On Mon, Jul 6, 2020 at 2:58 AM Valdis Klētnieks  wrote:
>
> On Sun, 05 Jul 2020 15:34:32 +0800, 孙世龙 sunshilong said:
>
> > There is "softlockup_tick()" function in the source code of
> > linux-2.6.32(refer to
> > https://elixir.bootlin.com/linux/v2.6.32.39/source/kernel/softlockup.c#L104),
> > but what's the corresponding function in the linux-4.9 or later?
> > There is not even a source code file named by softlockup.c in the
> > linux-4.9 or later?
>
> 2.6.32 was a *long* time ago. Heck, even the BKL was still around at that 
> point.
>
> [/usr/src/linux-next] git show v2.6.32
> tag v2.6.32
> Tagger: Linus Torvalds 
> Date:   Wed Dec 2 19:51:29 2009 -0800
>
> and there were a *lot* of code changes from then until v4.9.
>
> [/usr/src/linux-next] git diff --shortstat v2.6.32..v4.9
>  59438 files changed, 14713566 insertions(+), 4896973 deletions(-)
>
> Even v4.9 is from long ago and far away, and of less and less relevance
> each new Linux release.
>
> [/usr/src/linux-next] git diff --shortstat v4.9..HEAD
>  73256 files changed, 11345968 insertions(+), 4464267 deletions(-)
>
> So.. since 2.6.32. there's been some 26 million new lines of code, which is an
> interestingly high number considering that there's only 27 million lines of
> code in the tree currently.
>
> In other words, essentially *everything* has been completely re-written and
> re-designed since 2.6.32, and "What is the corresponding function" is a
> question that is probably meaningless, because whatever you're looking for 
> from
> back then has almost certainly been completely re-written with a totally new
> approach.
>
> Seriously - 2.6.32 is of interest only to software archaeologists. There is
> nothing worth looking at in there that's relevant to today's code.
>
> But to answer your question: the entire kernel/softlockup.c file was removed
> in v2.6.36 because it had been replaced by entirely new code.
>

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


There is "softlockup_tick()" function in the source code of linux-2.6.32, but what's the corresponding function name in the linux-4.9 or later?

2020-07-05 Thread sunshilong
Hi, list

There is "softlockup_tick()" function in the source code of
linux-2.6.32(refer to
https://elixir.bootlin.com/linux/v2.6.32.39/source/kernel/softlockup.c#L104),
but what's the corresponding function in the linux-4.9 or later?
There is not even a source file named softlockup.c in
linux-4.9 or later.

Thank you for your attention to this matter.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Are there some potentially serious problems that I should be aware of if I totally disable the CONFIG_ACPI option on the X86_64 platform?

2020-07-04 Thread sunshilong
I see. Thank you.

On Sun, Jul 5, 2020 at 4:09 AM Pavel Machek  wrote:
>
> On Sat 2020-07-04 21:34:36, 孙世龙 sunshilong wrote:
> > Thank you for taking the time to respond to me.
> >
> > >These machines are still mostly IBM-PC compatible, so it is likely to
> > >somehow work. You'll likely get worse power and thermal
> > >management. Try it.
> > It's an industrial personal computer with an Intel processor.
> > What I am worried about is that it may damage the hardware.
>
> I'd simply try it. Risk is really quite low...
>
> If in doubt, you could ask vendor.
>
> But you will not get definitive answers on the mailing list...
>
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Are there some potentially serious problems that I should be aware of if I totally disable the CONFIG_ACPI option on the X86_64 platform?

2020-07-04 Thread sunshilong
Thank you for taking the time to respond to me.

>These machines are still mostly IBM-PC compatible, so it is likely to
>somehow work. You'll likely get worse power and thermal
>management. Try it.
It's an industrial personal computer with an Intel processor.
What I am worried about is that it may damage the hardware.

Thank you for your attention to this matter.
Best Regards.

On Sat, Jul 4, 2020 at 8:22 PM Pavel Machek  wrote:
>
> Hi!
>
> > Are there some potentially serious problems that I should be aware of
> > if I totally disable the CONFIG_ACPI option on the X86_64 platform?
> >
> > Would it do harm to the hardware?
> >
> > Thank you for your attention to this matter.
>
> These machines are still mostly IBM-PC compatible, so it is likely to
> somehow work. You'll likely get worse power and thermal
> management. Try it.
>
> Pavel
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: How can I investigate the cause of "watchdog: BUG: soft lockup"?

2020-07-04 Thread sunshilong
Hi, Valdis Klētnieks

Thank you for taking the time to respond to me.
I have a better understanding of this matter.

>> Can I draw the conclusion that continually acquiring the spinlock causes the 
>> soft
>> lockup and the CPU has been stuck for 22s?
>> Can I think in this way?

>No.  It's been stuck for 22s *TRYING* and *FAILING* to get the spinlock.

I see. So there is a thread that has held the corresponding spinlock
for more than 22s, and a CPU is stuck (busy trying to acquire the spinlock) for the
same duration.
Can I think in this way?

Thank you for your attention to this matter.
Best Regards.

On Sat, Jul 4, 2020 at 4:09 PM Valdis Klētnieks  wrote:

On Sat, Jul 4, 2020 at 5:04 PM 孙世龙 sunshilong  wrote:
>
> Hi, Valdis Klētnieks
>
> Thank you for taking the time to respond to me.
> I have a better understanding of this matter.
>
> >> Can I draw the conclusion that continually acquiring the spinlock causes 
> >> the soft
> >> lockup and the CPU has been stuck for 22s?
> >> Can I think in this way?
>
> >No.  It's been stuck for 22s *TRYING* and *FAILING* to get the spinlock.
>
> I see. So there is a thread that has held the corresponding spinlock
> for more 22s.
> Can I think in this way?
>
> Thank you for your attention to this matter.
> Best Regards.
>
> On Sat, Jul 4, 2020 at 4:09 PM Valdis Klētnieks  wrote:
> >
> >
> > > Can I draw the conclusion that continually acquiring the spinlock causes 
> > > the soft
> > > lockup and the CPU has been stuck for 22s?
> > > Can I think in this way?
> >
> > No.  It's been stuck for 22s *TRYING* and *FAILING* to get the spinlock.
> >
> > For comparison - spinlocks are usually used when you need a lock, but the
> > code protected by the lock is short (things like adding to a linked list, 
> > etc),
> > so it should again become available in milliseconds - things where it would 
> > take
> > longer to put this thread to sleep and wake another one up than we expect
> > to be waiting for this lock.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: How can I investigate the cause of "watchdog: BUG: soft lockup"?

2020-07-04 Thread sunshilong
Hi, Valdis Klētnieks

Thank you for taking the time to respond to me.
I have a better understanding of this matter.

>> Can I draw the conclusion that continually acquiring the spinlock causes the 
>> soft
>> lockup and the CPU has been stuck for 22s?
>> Can I think in this way?

>No.  It's been stuck for 22s *TRYING* and *FAILING* to get the spinlock.

I see. So there is a thread that has held the corresponding spinlock
for more than 22s.
Can I think in this way?

Thank you for your attention to this matter.
Best Regards.

On Sat, Jul 4, 2020 at 4:09 PM Valdis Klētnieks  wrote:
>
>
> > Can I draw the conclusion that continually acquiring the spinlock causes 
> > the soft
> > lockup and the CPU has been stuck for 22s?
> > Can I think in this way?
>
> No.  It's been stuck for 22s *TRYING* and *FAILING* to get the spinlock.
>
> For comparison - spinlocks are usually used when you need a lock, but the
> code protected by the lock is short (things like adding to a linked list, 
> etc),
> so it should again become available in milliseconds - things where it would 
> take
> longer to put this thread to sleep and wake another one up than we expect
> to be waiting for this lock.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: How can I investigate the cause of "watchdog: BUG: soft lockup"?

2020-07-04 Thread sunshilong
Hi, Valdis Klētnieks

Thank you for your generous help.
My understanding of this matter has reached a new level with your help.

>>Jul  3 10:23:31 yx kernel: [ 1176.166058] watchdog: BUG: soft lockup -
>>CPU#0 stuck for 22s! [rt_cansend:1837]
>>Jul  3 10:23:31 yx kernel: [ 1176.166066] Modules linked in:
>>..
>>Jul  3 10:23:31 yx kernel: [ 1176.166252] Call Trace:
>>Jul  3 10:23:31 yx kernel: [ 1176.166261]  _raw_spin_lock+0x20/0x30
>>Jul  3 10:23:31 yx kernel: [ 1176.166270]  can_write+0x6c/0x2c0 [advcan]
>>
>You get into function can_write() in module advcan.
>That tries to take a spinlock, while something else already has it.
Can I draw the conclusion that continually trying to acquire the spinlock
causes the soft lockup and that the CPU has been stuck for 22s?
Can I think of it in this way?

Thank you for your attention to this matter.
Best Regards.


On Sat, Jul 4, 2020 at 12:39 PM Valdis Klētnieks  wrote:
>
> > Could you please give me some hint on how to investigate the cause deeply?
>
> Shortening the call trace to the relevant lines:
>
> >  Jul  3 10:23:31 yx kernel: [ 1176.166252] Call Trace:
> > Jul  3 10:23:31 yx kernel: [ 1176.166261]  _raw_spin_lock+0x20/0x30
> > Jul  3 10:23:31 yx kernel: [ 1176.166270]  can_write+0x6c/0x2c0 [advcan]
> > Jul  3 10:23:31 yx kernel: [ 1176.166292]  __vfs_write+0x3a/0x190
>
> You get into function can_write() in module advcan.
>
> That tries to take a spinlock, while something else already has it.
>
> The spinlock call is (roughly) 15% of the way through the function 
> can_write().
>
> The 'modules linked in' list includes "advcan(OE)".
>
> The 'O' tells us it's an out-of-tree module, which means you need to talk to
> whoever wrote the module and find out why it's hanging on a spin lock (most
> likely something else is failing to release it).
>
> And that's about as far as we can hint, since we don't have the source for 
> your
> out-of-tree module.  If the people who wrote it would clean it up and get it
> into the base Linux tree, then we'd all have access to it and be able to help
> in much greater detail.
>

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


How can I investigate the cause of "watchdog: BUG: soft lockup"?

2020-07-03 Thread sunshilong
Hi, list
I encountered the error "watchdog: BUG: soft lockup" when I sent
data through the CAN bus.
Could you please give me some hints on how to investigate the cause more deeply?
Thank you for your attention to this matter.

The most relevant part of the log (the full log is at the end):
 Jul  3 10:22:36 yx kernel: [ 1120.688506] CAN[0][0] RX: FIFO overrun
Jul  3 10:23:31 yx kernel: [ 1176.166058] watchdog: BUG: soft lockup -
CPU#0 stuck for 22s! [rt_cansend:1837]
...
 Jul  3 10:23:31 yx kernel: [ 1176.166252] Call Trace:
Jul  3 10:23:31 yx kernel: [ 1176.166261]  _raw_spin_lock+0x20/0x30
Jul  3 10:23:31 yx kernel: [ 1176.166270]  can_write+0x6c/0x2c0 [advcan]
Jul  3 10:23:31 yx kernel: [ 1176.166276]  ? dequeue_signal+0xae/0x1a0
Jul  3 10:23:31 yx kernel: [ 1176.166281]  ? recalc_sigpending+0x1b/0x50
Jul  3 10:23:31 yx kernel: [ 1176.166286]  ? __set_task_blocked+0x3c/0xa0
Jul  3 10:23:31 yx kernel: [ 1176.166292]  __vfs_write+0x3a/0x190
Jul  3 10:23:31 yx kernel: [ 1176.166298]  ? apparmor_file_permission+0x1a/0x20
Jul  3 10:23:31 yx kernel: [ 1176.166302]  ? security_file_permission+0x3b/0xc0
Jul  3 10:23:31 yx kernel: [ 1176.166307]  vfs_write+0xb8/0x1b0
Jul  3 10:23:31 yx kernel: [ 1176.166312]  ksys_write+0x5c/0xe0
Jul  3 10:23:31 yx kernel: [ 1176.166316]  __x64_sys_write+0x1a/0x20
Jul  3 10:23:31 yx kernel: [ 1176.166321]  do_syscall_64+0x87/0x250
Jul  3 10:23:31 yx kernel: [ 1176.166326]
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Here is the full log:
Jul  3 10:06:16 yx kernel: [  140.313856] CAN[0][0] RX: FIFO overrun
Jul  3 10:06:59 yx kernel: [  183.323792] CAN[0][0] RX: FIFO overrun
Jul  3 10:07:42 yx kernel: [  226.329465] CAN[0][0] RX: FIFO overrun
Jul  3 10:08:24 yx kernel: [  268.362822] CAN[0][0] RX: FIFO overrun
Jul  3 10:09:07 yx kernel: [  311.372488] CAN[0][0] RX: FIFO overrun
Jul  3 10:09:50 yx kernel: [  354.377996] CAN[0][0] RX: FIFO overrun
Jul  3 10:10:32 yx kernel: [  396.411726] CAN[0][0] RX: FIFO overrun
Jul  3 10:11:15 yx kernel: [  439.421156] CAN[0][0] RX: FIFO overrun
Jul  3 10:11:58 yx kernel: [  482.426522] CAN[0][0] RX: FIFO overrun
Jul  3 10:12:40 yx kernel: [  524.460688] CAN[0][0] RX: FIFO overrun
Jul  3 10:13:23 yx kernel: [  567.469857] CAN[0][0] RX: FIFO overrun
Jul  3 10:14:06 yx kernel: [  610.475021] CAN[0][0] RX: FIFO overrun
Jul  3 10:14:48 yx kernel: [  652.509597] CAN[0][0] RX: FIFO overrun
Jul  3 10:15:31 yx kernel: [  695.518491] CAN[0][0] RX: FIFO overrun
Jul  3 10:16:14 yx kernel: [  738.523551] CAN[0][0] RX: FIFO overrun
Jul  3 10:16:55 yx kernel: [  779.558139] CAN[0][0] RX: FIFO overrun
Jul  3 10:17:38 yx kernel: [  822.566773] CAN[0][0] RX: FIFO overrun
Jul  3 10:18:21 yx kernel: [  865.571697] CAN[0][0] RX: FIFO overrun
Jul  3 10:19:03 yx kernel: [  907.607049] CAN[0][0] RX: FIFO overrun
Jul  3 10:19:46 yx kernel: [  950.615449] CAN[0][0] RX: FIFO overrun
Jul  3 10:20:29 yx kernel: [  993.620196] CAN[0][0] RX: FIFO overrun
Jul  3 10:21:11 yx kernel: [ 1035.655974] CAN[0][0] RX: FIFO overrun
Jul  3 10:21:54 yx kernel: [ 1078.664116] CAN[0][0] RX: FIFO overrun
Jul  3 10:22:36 yx kernel: [ 1120.688506] CAN[0][0] RX: FIFO overrun
Jul  3 10:23:31 yx kernel: [ 1176.166058] watchdog: BUG: soft lockup -
CPU#0 stuck for 22s! [rt_cansend:1837]
Jul  3 10:23:31 yx kernel: [ 1176.166066] Modules linked in: bnep
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic
nls_iso8859_1 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep
snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi intel_rapl
intel_soc_dts_thermal intel_soc_dts_iosf intel_powerclamp coretemp
kvm_intel snd_seq punit_atom_debug crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel snd_seq_device cryptd intel_cstate snd_timer
hci_uart snd lpc_ich advcan(OE) mei_txe btqca soundcore mei btbcm
btintel bluetooth ecdh_generic rfkill_gpio pwm_lpss_platform mac_hid
pwm_lpss parport_pc ppdev lp parport autofs4 i915 kvmgt vfio_mdev mdev
vfio_iommu_type1 vfio kvm irqbypass drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops igb drm dca ahci i2c_algo_bit
libahci video i2c_hid hid
Jul  3 10:23:31 yx kernel: [ 1176.166204] CPU: 0 PID: 1837 Comm:
rt_cansend Tainted: G   OE 4.19.84-solve-alc-failure #1
Jul  3 10:23:31 yx kernel: [ 1176.166209] I-pipe domain: Linux
Jul  3 10:23:31 yx kernel: [ 1176.166218] RIP:
0010:queued_spin_lock_slowpath+0xd9/0x1a0
Jul  3 10:23:31 yx kernel: [ 1176.166223] Code: 48 03 34 c5 00 67 37
91 48 89 16 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32
48 85 f6 74 07 0f 0d 0e eb 02 f3 90 <8b> 07 66 85 c0 75 f7 41 89 c0 66
45 31 c0 41 39 c8 0f 84 96 00 00
Jul  3 10:23:31 yx kernel: [ 1176.166226] RSP: 0018:be6f4c17bd08
EFLAGS: 0202 ORIG_RAX: ff13
Jul  3 10:23:31 yx kernel: [ 1176.166231] RAX:  RBX:
 RCX: 
Jul  3 10:23:31 yx kernel: [ 1176.166234] RDX:  RSI:
 RDI: 
Jul  3 10:23:31 yx kernel: [ 1176.166236] RBP: be6f4c17bd08 R08:
 R09

Re: How do you investigate the cause of total hang? Is there some information that I should pay attention to in order to get some hint?

2020-07-02 Thread sunshilong
Hi, Cong Wang

Thank you for taking the time to respond to me.
Do you think the message (i.e. "RCU detect a stall on CPU 2") indicates
there is a lockup?

On Wed, Jul 1, 2020 at 2:07 PM Cong Wang  wrote:


>
> On Tue, Jun 30, 2020 at 7:49 PM 孙世龙 sunshilong  
> wrote:
> >
> > Hi, list
> >
> > My x86 machine(linux4.19) sometimes hangs, suddenly not responding in
> > any way to the mouse or the keyboard.
> >
> > How can I investigate why it hung up? Is there extra information I can
> > find for a clue? Is there anything less drastic than power-off to get
> > some kind of action, if only some limited shell or just beeps,
> > but might give a clue?
> >
>
> If the hang was a crash which you didn't get a chance to capture the
> last kernel log, you can use kdump to collect them. The kernel log
> tells what kind of crash it is, a NULL pointer deref, a kernel page fault
> etc..
>
> If the hang was a hard lockup, you have to turn on lockup detector
> and also kdump to capture what the detector tells.
>
> Thanks.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Could interrupts be handled while the current process context is being switched?

2020-07-02 Thread sunshilong
Could interrupts be handled while the current process context is being switched?

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Is there some method or software that could purposely generate a lot of physical memory fragmentation on Linux?

2020-07-02 Thread sunshilong
Hi, Valdis Klētnieks, Mulyadi Santosa

Thanks to both of you.
>> Just pseudo idea, if this is in user space, try to:  allocate many blocks
>> of memory using malloc, each having different size, keep the returned
>> pointer, then randomly free() some of them, then malloc() again with
>> different size

>That will cause userspace malloc() to have fragmentation, but as far
>as the kernel is concerned it's all just 4K pages of user memory.
>Causing physical memory fragmentation will require abusing the kernel
>memory allocators such as kmalloc() and vmalloc() and friends.

I fully understand what you mean by "cause userspace malloc() to have
fragmentation".
I am sorry, maybe I misled you. I just want to reach a state where there are no
free high-order blocks (e.g. 32KB, 64KB, 128KB, etc.) left on the platform.
How can I achieve this goal more efficiently and automatically?
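One approach I can imagine is a throwaway kernel module that grabs a large number of single pages and then frees every second one, so the buddy allocator is left with isolated free pages it cannot merge back into high-order blocks. A rough sketch (module name and limits are made up; only for a test machine, as it deliberately puts the box under memory pressure):

#include <linux/module.h>
#include <linux/gfp.h>
#include <linux/vmalloc.h>

#define MAX_PAGES (1UL << 20)		/* cap: 1M order-0 pages */

static struct page **pages;
static unsigned long nr_pages;

static int __init fragger_init(void)
{
	unsigned long i;

	pages = vmalloc(MAX_PAGES * sizeof(*pages));
	if (!pages)
		return -ENOMEM;

	/* __GFP_NOWARN/__GFP_NORETRY: stop quietly when memory gets tight */
	for (nr_pages = 0; nr_pages < MAX_PAGES; nr_pages++) {
		pages[nr_pages] = alloc_page(GFP_KERNEL | __GFP_NOWARN |
					     __GFP_NORETRY);
		if (!pages[nr_pages])
			break;
	}

	/* free every second page: the kept neighbours block buddy merging */
	for (i = 0; i < nr_pages; i += 2) {
		__free_page(pages[i]);
		pages[i] = NULL;
	}

	pr_info("fragger: allocated %lu pages, freed every second one\n", nr_pages);
	return 0;
}

static void __exit fragger_exit(void)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++)
		if (pages[i])
			__free_page(pages[i]);
	vfree(pages);
}

module_init(fragger_init);
module_exit(fragger_exit);
MODULE_LICENSE("GPL");

Watching /proc/buddyinfo before and after loading it should show the high-order columns shrinking.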
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Are there some potentially serious problems that I should be aware of if I totally disable the CONFIG_ACPI option on the X86_64 platform?

2020-07-01 Thread sunshilong
Hi, Greg
Thank you for taking the time to respond to me.

>> Would it do harm to the hardware?
> It might, try it and see :)
It's really bad news.


On Wed, Jul 1, 2020 at 8:24 PM 孙世龙 sunshilong  wrote:
>
> Hi, Gred
> Thank you for taking the time to respond to me.
> >> Would it do harm to the hardware?
> > It might, try it and see :)
> It's really bad news.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Are there some potentially serious problems that I should be aware of if I totally disable the CONFIG_ACPI option on the X86_64 platform?

2020-07-01 Thread sunshilong
Hi, list

Are there some potentially serious problems that I should be aware of
if I totally disable the CONFIG_ACPI option on the X86_64 platform?

Would it do harm to the hardware?

Thank you for your attention to this matter.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


How do you investigate the cause of total hang? Is there some information that I should pay attention to in order to get some hint?

2020-06-30 Thread sunshilong
Hi, list

My x86 machine (Linux 4.19) sometimes hangs, suddenly not responding in
any way to the mouse or the keyboard.

How can I investigate why it hung up? Is there extra information I can
find for a clue? Is there anything less drastic than power-off to get
some kind of action, if only some limited shell or just beeps,
but might give a clue?

Thank you for your attention to this matter.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Is there some method or software that could purposely generate a lot of physical memory fragmentation on Linux?

2020-06-29 Thread sunshilong
Hi, list
  Is there some method or software that could purposely generate a lot
of physical memory fragmentation on Linux?

 I need to do some tests under such circumstances.

Thank you for your attention to this matter.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Are there some potential problems that I should be aware of if I allocate the memory which doesn't have any relation to peripheral hardwares(i.e. DMA, PCI, serial port and etc) by vmalloc() instea

2020-06-26 Thread sunshilong
dump_stack+0x9e/0xc8
[22041.387718]  warn_alloc+0x100/0x190
[22041.387725]  __alloc_pages_slowpath+0xb93/0xbd0
[22041.387732]  __alloc_pages_nodemask+0x26d/0x2b0
[22041.387739]  alloc_pages_current+0x6a/0xe0
[22041.387744]  kmalloc_order+0x18/0x40
[22041.387748]  kmalloc_order_trace+0x24/0xb0
[22041.387754]  __kmalloc+0x20e/0x230
[22041.387759]  ? __vmalloc_node_range+0x171/0x250
[22041.387765]  xnheap_init+0x87/0x200
[22041.387770]  ? remove_process+0xc0/0xc0
[22041.387775]  cobalt_umm_init+0x61/0xb0
[22041.387779]  cobalt_process_attach+0x64/0x4c0
[22041.387784]  ? snprintf+0x45/0x70
[22041.387790]  ? security_capable+0x46/0x60
[22041.387794]  bind_personality+0x5a/0x120
[22041.387798]  cobalt_bind_core+0x27/0x60
[22041.387803]  CoBaLt_bind+0x18a/0x1d0
[22041.387812]  ? handle_head_syscall+0x3f0/0x3f0
[22041.387816]  ipipe_syscall_hook+0x119/0x340
[22041.387822]  __ipipe_notify_syscall+0xd3/0x190
[22041.387827]  ? __x64_sys_rt_sigaction+0x7b/0xd0
[22041.387832]  ipipe_handle_syscall+0x3e/0xc0
[22041.387837]  do_syscall_64+0x3b/0x250
[22041.387842]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[22041.387847] RIP: 0033:0x7ff3d074e481
[22041.387852] Code: 89 c6 48 8b 05 10 6b 21 00 c7 04 24 00 00 00 a4
8b 38 85 ff 75 43 bb 00 00 00 10 c7 44 24 04 11 00 00 00 48 89 e7 89
d8 0f 05  04 00 00 00 48 89 c3 e8 e2 e0 ff ff 8d 53 26 83 fa 26 0f
87 46
[22041.387855] RSP: 002b:7ffc62caf210 EFLAGS: 0246 ORIG_RAX:
1000
[22041.387860] RAX: ffda RBX: 1000 RCX: 7ff3d074e481
[22041.387863] RDX:  RSI:  RDI: 7ffc62caf210
[22041.387865] RBP: 7ff3d20a3780 R08: 7ffc62caf160 R09: 
[22041.387868] R10: 0008 R11: 0246 R12: 7ff3d0965b00
[22041.387870] R13: 01104320 R14: 7ff3d0965d40 R15: 01104050
[22041.387876] Mem-Info:
[22041.387885] active_anon:56054 inactive_anon:109301 isolated_anon:0
active_file:110190 inactive_file:91980 isolated_file:0
unevictable:9375 dirty:1 writeback:0 unstable:0
slab_reclaimable:22463 slab_unreclaimable:19122
mapped:101678 shmem:25642 pagetables:7663 bounce:0
free:456443 free_pcp:0 free_cma:0
[22041.387891] Node 0 active_anon:224216kB inactive_anon:437204kB
active_file:440760kB inactive_file:367920kB unevictable:37500kB
isolated(anon):0kB isolated(file):0kB mapped:406712kB dirty:4kB
writeback:0kB shmem:102568kB writeback_tmp:0kB unstable:0kB
all_unreclaimable? no
[22041.387893] Node 0 DMA free:15892kB min:32kB low:44kB high:56kB
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
unevictable:0kB writepending:0kB present:15992kB managed:15892kB
mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB
[22041.387901] lowmem_reserve[]: 0 2804 3762 3762
[22041.387912] Node 0 DMA32 free:1798624kB min:5836kB low:8704kB
high:11572kB active_anon:188040kB inactive_anon:219400kB
active_file:184156kB inactive_file:346776kB unevictable:24900kB
writepending:0kB present:3017476kB managed:2927216kB mlocked:24900kB
kernel_stack:1712kB pagetables:7564kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB
[22041.387920] lowmem_reserve[]: 0 0 958 958
[22041.387930] Node 0 Normal free:11256kB min:1992kB low:2972kB
high:3952kB active_anon:36084kB inactive_anon:218100kB
active_file:257220kB inactive_file:21148kB unevictable:12600kB
writepending:4kB present:1048576kB managed:981268kB mlocked:12600kB
kernel_stack:5280kB pagetables:23088kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB
[22041.387938] lowmem_reserve[]: 0 0 0 0
[22041.387948] Node 0 DMA: 3*4kB (U) 3*8kB (U) 1*16kB (U) 1*32kB (U)
3*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M)
3*4096kB (M) = 15892kB
[22041.387990] Node 0 DMA32: 14912*4kB (UME) 13850*8kB (UME) 9325*16kB
(UME) 5961*32kB (UME) 3622*64kB (UME) 2359*128kB (UME) 1128*256kB
(UME) 524*512kB (M) 194*1024kB (UM) 0*2048kB 0*4096kB = 1799872kB
[22041.388033] Node 0 Normal: 1643*4kB (UME) 71*8kB (UME) 47*16kB (UM)
35*32kB (M) 38*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 11572kB
[22041.388071] Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=2048kB
[22041.388073] 232507 total pagecache pages
[22041.388077] 7 pages in swap cache
[22041.388079] Swap cache stats: add 1015, delete 1008, find 0/1
[22041.388081] Free swap  = 995068kB
[22041.388083] Total swap = 999420kB
[22041.388086] 1020511 pages RAM
[22041.388088] 0 pages HighMem/MovableOnly
[22041.388090] 39417 pages reserved
[22041.388092] 0 pages hwpoisoned


>
> On Sat, Jun 27, 2020 at 01:16:50PM +0800, 孙世龙 sunshilong wrote:
> > >So as per the above - you allocate one struct array at driver load time for
> > >this stuff.  You already know how big the structure/array has to be based 
> > >on
> > >the maximum number of devices or whatever you're trying to track.
> > >And 

Re: Are there some potential problems that I should be aware of if I allocate the memory which doesn't have any relation to peripheral hardwares(i.e. DMA, PCI, serial port and etc) by vmalloc() instea

2020-06-26 Thread sunshilong
Hi, Valdis Klētnieks, Greg KH

Thanks a lot to both of you.

>As I mentioned a few days ago, the fact that a high-order allocation failed
>does not necessarily mean a total failure, as often the driver can instead
>allocate several smaller areas.
The related code snippet is not used by a driver; it's part of the real-time
system. I don't want to modify too much of its code.

>First, Linux is not a real-time OS.  Second, "real time" doesn't automatically
>imply those two options have to be disabled.  It's trickier to do it when you
>have hard real-time limits, but it's not impossible.
Excuse me, what do you mean by "It's trickier to do it when you have hard
real-time limits, but it's not impossible."? Could you please explain that in
simpler words?

I have to disable these options since the real-time patch requires it.

>Having said that about migration and compaction, anybody sane who's writing for
>an actual real-time system already knows *exactly* how much memory the system
>will use, and the critical allocations are done at system startup to guarantee
>that (a) the allocations succeed and (b) most of the memory is pre-allocated
>with little chance of causing fragmentation.
I fully understand what you mean. Your conclusion is quite right, but it rests
on an assumption: that there is only one real-time process on the platform and
that it is not restarted now and then. What would happen if many real-time and
non-real-time processes were running on the same platform? What would happen
if the testers restarted them now and then? That does indeed cause memory
fragmentation.

And "exactly how much memory of each program(i.e. instead of system) will use"
is determined by the specific applications(i.e. only the user
application programmers
know how much memory the program needs). But the memory should be allocated
before the app is started(i.e. before the entry of the main()
function), you should not
use malloc() to acquire memory in your application code snippet.For
details, see the
example below.

>So as per the above - you allocate one struct array at driver load time for
>this stuff.  You already know how big the structure/array has to be based on
>the maximum number of devices or whatever you're trying to track.
>And if you don't know the maximum, you're not doing real time programming. Or
>at least not correctly.
Not at driver load time, but at the load time of the real-time process (i.e.
before the entry of the main() function). It needs to allocate (with vmalloc)
a huge block of memory (for example 80MB, maybe 50MB; how much is suitable is
decided by the specific application) to be used by the user application later.
Allocating such a huge block with vmalloc() is fine and the kernel does not
complain. But since a struct array is needed to track the usage of that huge
block, the real-time patch uses kmalloc() for that allocation, and the page
allocation failure occurs there.
I think there is no need to use kmalloc() at all. I want to use vmalloc()
instead of kmalloc(), despite kmalloc() being more efficient.
What do you think about it?
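To illustrate the pre-allocation pattern I mean, here is a minimal user-space sketch (hypothetical sizes and names, not the real-time patch's actual code): lock everything into RAM and pre-fault a private heap once, before main() runs, so no page faults or allocator calls happen later on the real-time path.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define RT_HEAP_SIZE (80UL * 1024 * 1024)	/* e.g. 80 MB, app-specific */

static char *rt_heap;

__attribute__((constructor))
static void rt_prealloc(void)
{
	/* keep current and future mappings resident (never paged to swap) */
	if (mlockall(MCL_CURRENT | MCL_FUTURE))
		perror("mlockall");

	rt_heap = malloc(RT_HEAP_SIZE);
	if (!rt_heap) {
		fprintf(stderr, "rt_prealloc: out of memory\n");
		exit(1);
	}
	/* touch every page so it is faulted in up front */
	memset(rt_heap, 0, RT_HEAP_SIZE);
}

int main(void)
{
	/* real-time code would carve its buffers out of rt_heap here */
	printf("pre-faulted %lu MB before main()\n", RT_HEAP_SIZE >> 20);
	return 0;
}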

Thank you for your generous help.
Look forward to hearing from you.
Best Regards.
sunshilong

On Sat, Jun 27, 2020 at 1:22 AM Valdis Klētnieks  wrote:
>
> On Fri, 26 Jun 2020 23:36:05 +0800, 孙世龙 sunshilong said:
> > Thank you for your attention to this matter.
> >
> > >Why are you having so many issues in allocating memory?
> > I often saw the page allocation failure recently. I must resolve this 
> > problem.
>
> As I mentioned a few days ago, the fact that a high-order allocation failed
> does not necessarily mean a total failure, as often the driver can instead
> allocate several smaller areas.
>
> > I have no choice other than disabling these options
> > (i.e. CONFIG-MIGRATION and CONFIG-COMPACTION)
> > since I am using a real-time OS.
>
> First, Linux is not a real-time OS.  Second, "real time" doesn't automatically
> imply those two options have to be disabled.  It's trickier to do it when you
> have hard real-time limits, but it's not impossible.
>
> > The current code snippet is using kmalloc() and often encounter the
> > aforementioned problem.
>
> Having said that about migration and compaction, anybody sane who's writing 
> for
> an actual real-time system already knows *exactly* how much memory the system
> will use, and the critical allocations are done at system startup to guarantee
> that (a) the allocations succeed and (b) most of the memory is pre-allocated
> with little chance of causing fragmentation.
>
> > So I want to use vmalloc() instead of kmalloc(). What do you think about it?
> > The memory to be allocated doesn't have any r

Re: Are there some potential problems that I should be aware of if I allocate the memory which doesn't have any relation to peripheral hardwares(i.e. DMA, PCI, serial port and etc) by vmalloc() instea

2020-06-26 Thread sunshilong
Thank you for your attention to this matter.

>Why are you having so many issues in allocating memory?
I often saw the page allocation failure recently. I must resolve this problem.
I have no choice other than disabling these options
(i.e. CONFIG-MIGRATION and CONFIG-COMPACTION)
since I am using a real-time OS.
It's easier to encounter such a problem since the said options are disabled.

>Does the kernel not provide enough different ways to do this for your
>driver/device/use case?
The current code snippet is using kmalloc() and often encounters the
aforementioned problem.
So I want to use vmalloc() instead of kmalloc(). What do you think about it?
The memory to be allocated doesn't have any relation to any peripheral
hardware (e.g. DMA, PCI, serial port, etc.). It's just used to store a struct
array which indicates the usage of other resources.
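For a bookkeeping array like that, I wonder whether kvmalloc_array()/kvfree() (already available in 4.19) would be enough: they try a physically contiguous kmalloc first and fall back to vmalloc when memory is too fragmented, so an order-9 failure stops being fatal. A sketch with illustrative struct and function names:

#include <linux/mm.h>
#include <linux/slab.h>

struct resource_slot {		/* hypothetical per-resource bookkeeping */
	unsigned long owner;
	unsigned int  flags;
};

static struct resource_slot *slots;

static int slots_init(size_t nr)
{
	/* contiguous if possible, vmalloc-backed otherwise */
	slots = kvmalloc_array(nr, sizeof(*slots), GFP_KERNEL | __GFP_ZERO);
	if (!slots)
		return -ENOMEM;
	return 0;
}

static void slots_fini(void)
{
	kvfree(slots);		/* handles both kmalloc- and vmalloc-backed memory */
}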

Best regards.

On Fri, Jun 26, 2020 at 10:13 PM Greg KH  wrote:

>
> On Fri, Jun 26, 2020 at 04:30:48PM +0800, 孙世龙 sunshilong wrote:
> > Hi, list
> >
> > Besides kmalloc() is more efficient, are there some potential problems that
> > I should be aware of if I allocate the memory which doesn't have any
> > relation to peripheral hardwares(i.e. DMA,PCI,serial port and etc) by
> > vmalloc() instead of kmalloc() to avoid the page allocation failure(caused
> > by kmalloc() while there are too much memory fragment)?
>
> It all depends on what you want to do with that memory.
>
> Why are you having so many issues in allocating memory?  Does the kernel
> not provide enough different ways to do this for your driver/device/use
> case?
>
> If not, what are you trying to do that is not fitting with the existing
> interfaces?
>
> thanks,
>
> greg k-h

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Are there some potential problems that I should be aware of if I allocate the memory which doesn't have any relation to peripheral hardwares(i.e. DMA, PCI, serial port and etc) by vmalloc() instead of

2020-06-26 Thread sunshilong
Hi, list

Besides kmalloc() being more efficient, are there some potential problems that
I should be aware of if I allocate memory which doesn't have any
relation to peripheral hardware (e.g. DMA, PCI, serial port, etc.) with
vmalloc() instead of kmalloc(), in order to avoid the page allocation failure (caused
by kmalloc() when there is too much memory fragmentation)?

Thank you for your attention to this matter.
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Are there still some methods that could be used by the Linux kernel to reduce memory fragmentation while both CONFIG_MIGRATION and CONFIG_COMPACTION are disabled?

2020-06-23 Thread sunshilong
Are there still some methods that could be used by the Linux kernel
to reduce memory fragmentation while both CONFIG_MIGRATION
and CONFIG_COMPACTION are disabled?

Are there some system settings that could help achieve this goal?
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


What are the potential problems if the option CONFIG_MIGRATION is disabled?

2020-06-22 Thread sunshilong
As per the subject, I find that it has some relation to the error "page
allocation failure: order:9". This error occurs frequently if this option is
disabled. Could memory fragmentation still be reduced by other methods if
this option is disabled?
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: How do you comprehend the saying that the kernel's memory is not pageable whereas get_free_page use a page-oriented technique?

2020-06-20 Thread sunshilong
>> Unfortunately for kernel developers, allocating memory in the kernel
>> is not as simple as allocating memory in userspace. A number of
>> factors contribute to the complication, among them:
>> The kernel is limited to about 1GB of virtual and physical memory.
>> **The kernel's memory is not pageable.**
>
>> If a module needs to allocate big chunks of memory, it is usually
>> better to use a page-oriented technique.
>
>Due to memory fragmentation, if a module needs (say) 2M of
>memory for an I/O buffer, it's more likely to be able to allocate
>512 4K pages scattered through the 1GB of memory than it
>is to get 1 contiguous chunk of memory.
>
>The fact it's not pageable doesn't mean that pages aren't relevant
>as the unit of allocation.

Thank you for your help.

For the saying that the kernel's memory is not pageable, I think the
word "pageable" should be understood in this way:
Linux keeps the whole kernel in physical memory at all times, and no
such memory would be temporarily moved to swap even if it is not currently
in use.
Am I right?

If my understanding is right, one more question arises: what about the
memory related to user processes, e.g. the process control block?
Can it be swapped out?

Thank you for your attention to this matter.
Look forward to hearing from you.

On Sat, Jun 20, 2020 at 5:11 PM Valdis Klētnieks  wrote:

> On Sat, 20 Jun 2020 14:18:21 +0800, 孙世龙 sunshilong said:
>
> > Unfortunately for kernel developers, allocating memory in the kernel
> > is not as simple as allocating memory in userspace. A number of
> > factors contribute to the complication, among them:
> > The kernel is limited to about 1GB of virtual and physical memory.
> > **The kernel's memory is not pageable.**
>
> > If a module needs to allocate big chunks of memory, it is usually
> > better to use a page-oriented technique.
>
> Due to memory fragmentation, if a module needs (say) 2M of
> memory for an I/O buffer, it's more likely to be able to allocate
> 512 4K pages scattered through the 1GB of memory than it
> is to get 1 contiguous chunk of memory.
>
> The fact it's not pageable doesn't mean that pages aren't relevant
> as the unit of allocation.
>
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


How do you comprehend the saying that the kernel's memory is not pageable whereas get_free_page use a page-oriented technique?

2020-06-19 Thread sunshilong
As per the documentation (https://www.linuxjournal.com/article/6930),
which says [emphasis mine]:
Unfortunately for kernel developers, allocating memory in the kernel
is not as simple as allocating memory in userspace. A number of
factors contribute to the complication, among them:
The kernel is limited to about 1GB of virtual and physical memory.
**The kernel's memory is not pageable.**

As per the documentation (https://www.oreilly.com/library/view/linux-device-drivers/0596005903/ch08.html),
which says:
get_free_page and Friends
If a module needs to allocate big chunks of memory, it is usually
better to use a page-oriented technique.

I am confused after reading these two statements.
I think they contradict each other:
the former claims that the kernel's memory is not pageable,
whereas the latter implicitly states that there is a page-oriented
technique used by the kernel (i.e. the function get_free_page
depends on a page-oriented technique).

I have thought about it for a long time, but I still don't
comprehend them. I would be grateful to have some help with this
question.

Thank you for your attention to this matter.
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Why does “page allocation failure” occur whereas there are still “58*4096kB (C)” could be used?

2020-06-19 Thread sunshilong
>> Why doesn't the kernel use two memory blocks whose size is 2048KB (i.e. order 9)
>> instead of one block of order 10 (you see, there are still three free blocks
>> and 2048KB*2=4096KB, equivalent to the memory size of order 10)?
>
>Most parts of the kernel, when asking for very high-order allocations, *will*
>have a fallback strategy to use smaller chunks. So, for instance, if a device
>need a 1M buffer and supports scatter-gather operations, if 1M of contiguous
>memory isn't available, the kernel can ask for 4 256K chunks and have the I/O
>directed into the 4 areas.  However, if the memory *has* to be contiguous (for
>example, no scatter/gather available, or it's for an array data structure),
>then it can't do that.

Thank you for the clarification.
I understand it on a deeper level with your help.

How can I know whether scatter/gather is available or not?
In other words, when is it available and when is it not?
I do not intend to ask about the behavior of the gadget driver.
I just wonder how I can confirm it in general.
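
As far as I can tell, whether scatter/gather can actually be used is a property of the specific device and its driver (i.e. whether the hardware can do DMA from a list of non-contiguous chunks), so it has to be checked driver by driver. Purely as a sketch, with made-up names, a driver that does support it typically builds a scatterlist over non-contiguous pages roughly like this:

#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/scatterlist.h>

#define SG_NPAGES 16

static struct page *sg_pages[SG_NPAGES];
static struct scatterlist sgl[SG_NPAGES];

/* Build a scatter-gather list from individually allocated, non-contiguous pages. */
static int build_sg_buffer(void)
{
	int i;

	for (i = 0; i < SG_NPAGES; i++) {
		sg_pages[i] = alloc_page(GFP_KERNEL);
		if (!sg_pages[i])
			return -ENOMEM;	/* error unwinding omitted for brevity */
	}

	sg_init_table(sgl, SG_NPAGES);
	for (i = 0; i < SG_NPAGES; i++)
		sg_set_page(&sgl[i], sg_pages[i], PAGE_SIZE, 0);

	return 0;
}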

Thank you for your attention to this matter.
Look forward to hearing from you.
Best regards.

Valdis Klētnieks wrote on Fri, Jun 19, 2020 at 3:14 PM:

> On Fri, 19 Jun 2020 14:56:20 +0800, 孙世龙 sunshilong said:
>
> > Why doesn't the kernel use two memory blocks whose size is 2048KB (i.e. order 9)
> > instead of one block of order 10 (you see, there are still three free blocks
> > and 2048KB*2=4096KB, equivalent to the memory size of order 10)?
>
> Most parts of the kernel, when asking for very high-order allocations, *will*
> have a fallback strategy to use smaller chunks. So, for instance, if a device
> need a 1M buffer and supports scatter-gather operations, if 1M of contiguous
> memory isn't available, the kernel can ask for 4 256K chunks and have the I/O
> directed into the 4 areas.  However, if the memory *has* to be contiguous (for
> example, no scatter/gather available, or it's for an array data structure),
> then it can't do that.
>
> And in fact, that fallback could very well have happened in this case - I
> didn't bother chasing back to see if the gadget driver does recovery by
> allocating multiple smaller chunks.
>
> (That's a good "exercise for the student"... :)
>
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


What the differences between "__GFP_NOFAIL" and "__GFP_REPEAT"?

2020-06-19 Thread sunshilong
As per the documentation( https://www.linuxjournal.com/article/6930),
which says:
Flag           Description
__GFP_REPEAT   The kernel repeats the allocation if it fails.
__GFP_NOFAIL   The kernel can repeat the allocation.

So, both of them may cause the kernel to repeat the allocation operation.
How can I choose between them?
What are the major differences?
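
For what it's worth, my understanding is that __GFP_NOFAIL is the stronger guarantee: the allocator keeps retrying and the call cannot return NULL, so it is reserved for small allocations whose failure genuinely cannot be handled. __GFP_REPEAT was only a "try harder, but may still fail" hint (and has since been replaced in newer kernels by __GFP_RETRY_MAYFAIL). A minimal sketch of the call shape, for illustration only:

#include <linux/gfp.h>
#include <linux/slab.h>

/*
 * With __GFP_NOFAIL the allocator may retry indefinitely and the call
 * cannot fail, so there is no NULL check to make.  The helper name is
 * made up for this sketch.
 */
static void *must_succeed_alloc(size_t size)
{
	return kmalloc(size, GFP_KERNEL | __GFP_NOFAIL);
}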

Thank you for your attention to this matter.
Looking forward to hearing from you.
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


About the usage of "__GFP_COMP"?

2020-06-19 Thread sunshilong
Hi,
I found nothing useful, only a comment on "__GFP_COMP" throughout
the kernel source code, which says: "__GFP_COMP address compound
page metadata."

I googled it, but I am still confused.

Besides, I called the function kzalloc with the flag "GFP_KERNEL"
on Linux-4.19.82, but the kernel eventually complains and reports the mode
as "GFP_KERNEL|__GFP_COMP|__GFP_ZERO". I understand why "__GFP_ZERO"
and "GFP_KERNEL" are there, but where does "__GFP_COMP"
come from?

Here is the related code snippet:
kzalloc(npages, GFP_KERNEL);

Here is the implementation of kzalloc:
/**
 * kzalloc - allocate memory. The memory is set to zero.
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 */
static inline void *kzalloc(size_t size, gfp_t flags)
{
return kmalloc(size, flags | __GFP_ZERO);
}
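
Judging from the kmalloc_order frame in the call trace below, the large-allocation path of kmalloc is what adds "__GFP_COMP": when a request is too big for the slab caches, kmalloc falls through to a multi-page allocation and marks the block as a compound page. A simplified sketch of that path (an approximation, not the verbatim 4.19 source):

#include <linux/gfp.h>
#include <linux/mm.h>

static void *kmalloc_order_sketch(gfp_t flags, unsigned int order)
{
	struct page *page;

	flags |= __GFP_COMP;	/* treat the order-N block as one compound page */
	page = alloc_pages(flags, order);
	return page ? page_address(page) : NULL;
}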

Here is the most related log, output by "dmesg":
page allocation failure: order:9, mode:0x60c0c0
(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)

Here is the whole log:
[22041.387673] HelloWorldExamp: page allocation failure: order:9,
mode:0x60c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
[22041.387678] HelloWorldExamp cpuset=/ mems_allowed=0
[22041.387690] CPU: 3 PID: 27737 Comm: HelloWorldExamp Not tainted
4.19.84-bros #5
[22041.387693] Hardware name: Advantech
UNO-2372G-J021AE/UNO-2372G-J021AE, BIOS 5.6.5 09/04/2019
[22041.387695] I-pipe domain: Linux
[22041.387697] Call Trace:
[22041.387711]  dump_stack+0x9e/0xc8
[22041.387718]  warn_alloc+0x100/0x190
[22041.387725]  __alloc_pages_slowpath+0xb93/0xbd0
[22041.387732]  __alloc_pages_nodemask+0x26d/0x2b0
[22041.387739]  alloc_pages_current+0x6a/0xe0
[22041.387744]  kmalloc_order+0x18/0x40
[22041.387748]  kmalloc_order_trace+0x24/0xb0
[22041.387754]  __kmalloc+0x20e/0x230
[22041.387759]  ? __vmalloc_node_range+0x171/0x250
[22041.387765]  xnheap_init+0x87/0x200
[22041.387770]  ? remove_process+0xc0/0xc0
[22041.387775]  cobalt_umm_init+0x61/0xb0
[22041.387779]  cobalt_process_attach+0x64/0x4c0
[22041.387784]  ? snprintf+0x45/0x70
[22041.387790]  ? security_capable+0x46/0x60
[22041.387794]  bind_personality+0x5a/0x120
[22041.387798]  cobalt_bind_core+0x27/0x60
[22041.387803]  CoBaLt_bind+0x18a/0x1d0
[22041.387812]  ? handle_head_syscall+0x3f0/0x3f0
[22041.387816]  ipipe_syscall_hook+0x119/0x340
[22041.387822]  __ipipe_notify_syscall+0xd3/0x190
[22041.387827]  ? __x64_sys_rt_sigaction+0x7b/0xd0
[22041.387832]  ipipe_handle_syscall+0x3e/0xc0
[22041.387837]  do_syscall_64+0x3b/0x250
[22041.387842]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[22041.387847] RIP: 0033:0x7ff3d074e481
[22041.387852] Code: 89 c6 48 8b 05 10 6b 21 00 c7 04 24 00 00 00 a4 8b
38 85 ff 75 43 bb 00 00 00 10 c7 44 24 04 11 00 00 00 48 89 e7 89 d8 0f 05
 04 00 00 00 48 89 c3 e8 e2 e0 ff ff 8d 53 26 83 fa 26 0f 87 46
[22041.387855] RSP: 002b:7ffc62caf210 EFLAGS: 0246 ORIG_RAX:
1000
[22041.387860] RAX: ffda RBX: 1000 RCX:
7ff3d074e481
[22041.387863] RDX:  RSI:  RDI:
7ffc62caf210
[22041.387865] RBP: 7ff3d20a3780 R08: 7ffc62caf160 R09:

[22041.387868] R10: 0008 R11: 0246 R12:
7ff3d0965b00
[22041.387870] R13: 01104320 R14: 7ff3d0965d40 R15:
01104050
[22041.387876] Mem-Info:
[22041.387885] active_anon:56054 inactive_anon:109301 isolated_anon:0
active_file:110190 inactive_file:91980 isolated_file:0
unevictable:9375 dirty:1 writeback:0 unstable:0
slab_reclaimable:22463 slab_unreclaimable:19122
mapped:101678 shmem:25642 pagetables:7663 bounce:0
free:456443 free_pcp:0 free_cma:0
[22041.387891] Node 0 active_anon:224216kB inactive_anon:437204kB
active_file:440760kB inactive_file:367920kB unevictable:37500kB
isolated(anon):0kB isolated(file):0kB mapped:406712kB dirty:4kB
writeback:0kB shmem:102568kB writeback_tmp:0kB unstable:0kB
all_unreclaimable? no
[22041.387893] Node 0 DMA free:15892kB min:32kB low:44kB high:56kB
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
unevictable:0kB writepending:0kB present:15992kB managed:15892kB
mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB
[22041.387901] lowmem_reserve[]: 0 2804 3762 3762
[22041.387912] Node 0 DMA32 free:1798624kB min:5836kB low:8704kB
high:11572kB active_anon:188040kB inactive_anon:219400kB
active_file:184156kB inactive_file:346776kB unevictable:24900kB
writepending:0kB present:3017476kB managed:2927216kB mlocked:24900kB
kernel_stack:1712kB pagetables:7564kB bounce:0kB free_pcp:0kB local_pcp:0kB
free_cma:0kB
[22041.387920] lowmem

Why did not the kernel use the memory block named by "Node 0 DMA" while the argument of function kzalloc is "GFP_KERNEL"?

2020-06-19 Thread sunshilong
As per the documentation
(https://elixir.bootlin.com/linux/latest/source/include/linux/gfp.h#L292),
which says:
#define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS).
It does not explicitly bind the option of GFP_KERNEL to any of the
physical address zone modifiers(i.e. __GFP_DMA,__GFP_HIGHMEM,
__GFP_DMA32,__GFP_MOVABLE,GFP_ZONEMASK) indeed.

And there are free blocks in "Node 0 DMA" indeed.
For your convenience, the most related log is seen below:
Node 0 DMA: 3*4kB (U) 3*8kB (U) 1*16kB (U) 1*32kB (U) 3*64kB(U) 0*128kB
1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
Node 0 DMA32: 14912*4kB (UME) 13850*8kB (UME) 9325*16kB (UME)
5961*32kB(UME) 3622*64kB (UME) 2359*128kB (UME) 1128*256kB (UME)
524*512kB (M) 194*1024kB (UM) 0*2048kB 0*4096kB = 1799872kB
[22041.388033] Node 0 Normal: 1643*4kB (UME) 71*8kB (UME) 47*16kB (UM)
35*32kB (M) 38*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 11572kB

Here is the implementation of the function kzalloc(refer to
https://elixir.bootlin.com/linux/latest/source/include/linux/slab.h#L667):
/**
 * kzalloc - allocate memory. The memory is set to zero.
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 */
static inline void *kzalloc(size_t size, gfp_t flags)
{
return kmalloc(size, flags | __GFP_ZERO);
}
So I wonder why the kernel did not use the memory block named by
"Node 0 DMA" while the argument of function kzalloc is "GFP_KERNEL".
I have heard that the Linux kernel will search the "Normal" zone
first, then the "DMA32" zone, and finally the "DMA" zone when no physical
address zone modifier is explicitly specified.
I have googled this for a long time, but I still cannot understand why the
kernel complains. I would be grateful to have some help with it.
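
For what it's worth: as I understand it, a GFP_KERNEL allocation prefers the Normal zone and is only allowed to dip into DMA32/DMA as far as each lower zone's lowmem_reserve watermark permits, which is why the small DMA zone is effectively off-limits here. If an allocation really must come from ZONE_DMA, the zone modifier has to be passed explicitly; a minimal sketch, for illustration only:

#include <linux/gfp.h>
#include <linux/slab.h>

/* Explicitly request zeroed memory from ZONE_DMA. */
static void *dma_zone_zalloc(size_t size)
{
	return kzalloc(size, GFP_KERNEL | GFP_DMA);
}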
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Why the kernel could not use the memory block contained by "Node 0 DMA" while the argument of kzalloc() is "GFP_KERNEL"?

2020-06-19 Thread sunshilong
As per the documentation(
https://elixir.bootlin.com/linux/latest/source/include/linux/gfp.h#L292),
which says:
#define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS).
It does not explicitly bind the option of GFP_KERNEL to any of the
physical address zone modifiers (i.e. __GFP_DMA, __GFP_HIGHMEM,
__GFP_DMA32, __GFP_MOVABLE, GFP_ZONEMASK) indeed.

And there are free blocks in "Node 0 DMA" indeed. For your convenience, the
most related log is seen below:


[22041.387948] Node 0 DMA: 3*4kB (U) 3*8kB (U) 1*16kB (U) 1*32kB (U) 3*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
[22041.387990] Node 0 DMA32: 14912*4kB (UME) 13850*8kB (UME) 9325*16kB (UME) 5961*32kB (UME) 3622*64kB (UME) 2359*128kB (UME) 1128*256kB (UME) 524*512kB (M) 194*1024kB (UM) 0*2048kB 0*4096kB = 1799872kB
[22041.388033] Node 0 Normal: 1643*4kB (UME) 71*8kB (UME) 47*16kB (UM) 35*32kB (M) 38*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 11572kB

Here is the implementation of the function kzalloc (refer to
https://elixir.bootlin.com/linux/latest/source/include/linux/slab.h#L667):

/**
 * kzalloc - allocate memory. The memory is set to zero.
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 */
static inline void *kzalloc(size_t size, gfp_t flags)
{
	return kmalloc(size, flags | __GFP_ZERO);
}

So I wonder why the kernel did not use the memory block contained in "Node
0 DMA" while the flag passed to kzalloc() is "GFP_KERNEL".
I have heard that the Linux kernel will search the "Normal" zone
first, then the "DMA32" zone, and finally the "DMA" zone when no physical
address zone modifier is explicitly specified.
I have googled this for a long time, but I still cannot understand why this
occurs. I would be grateful to have some help with it.


Here is the whole log:
[22041.387673] HelloWorldExamp: page allocation failure: order:9,
mode:0x60c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
[22041.387678] HelloWorldExamp cpuset=/ mems_allowed=0
[22041.387690] CPU: 3 PID: 27737 Comm: HelloWorldExamp Not tainted
4.19.84-bros #5
[22041.387693] Hardware name: Advantech UNO-2372G-J021AE/UNO-2372G-J021AE,
BIOS 5.6.5 09/04/2019
[22041.387695] I-pipe domain: Linux
[22041.387697] Call Trace:
[22041.387711]  dump_stack+0x9e/0xc8
[22041.387718]  warn_alloc+0x100/0x190
[22041.387725]  __alloc_pages_slowpath+0xb93/0xbd0
[22041.387732]  __alloc_pages_nodemask+0x26d/0x2b0
[22041.387739]  alloc_pages_current+0x6a/0xe0
[22041.387744]  kmalloc_order+0x18/0x40
[22041.387748]  kmalloc_order_trace+0x24/0xb0
[22041.387754]  __kmalloc+0x20e/0x230
[22041.387759]  ? __vmalloc_node_range+0x171/0x250
[22041.387765]  xnheap_init+0x87/0x200
[22041.387770]  ? remove_process+0xc0/0xc0
[22041.387775]  cobalt_umm_init+0x61/0xb0
[22041.387779]  cobalt_process_attach+0x64/0x4c0
[22041.387784]  ? snprintf+0x45/0x70
[22041.387790]  ? security_capable+0x46/0x60
[22041.387794]  bind_personality+0x5a/0x120
[22041.387798]  cobalt_bind_core+0x27/0x60
[22041.387803]  CoBaLt_bind+0x18a/0x1d0
[22041.387812]  ? handle_head_syscall+0x3f0/0x3f0
[22041.387816]  ipipe_syscall_hook+0x119/0x340
[22041.387822]  __ipipe_notify_syscall+0xd3/0x190
[22041.387827]  ? __x64_sys_rt_sigaction+0x7b/0xd0
[22041.387832]  ipipe_handle_syscall+0x3e/0xc0
[22041.387837]  do_syscall_64+0x3b/0x250
[22041.387842]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[22041.387847] RIP: 0033:0x7ff3d074e481
[22041.387852] Code: 89 c6 48 8b 05 10 6b 21 00 c7 04 24 00 00 00 a4 8b 38
85 ff 75 43 bb 00 00 00 10 c7 44 24 04 11 00 00 00 48 89 e7 89 d8 0f 05
 04 00 00 00 48 89 c3 e8 e2 e0 ff ff 8d 53 26 83 fa 26 0f 87 46
[22041.387855] RSP: 002b:7ffc62caf210 EFLAGS: 0246 ORIG_RAX:
1000
[22041.387860] RAX: ffda RBX: 1000 RCX:
7ff3d074e481
[22041.387863] RDX:  RSI:  RDI:
7ffc62caf210
[22041.387865] RBP: 7ff3d20a3780 R08: 7ffc62caf160 R09:

[22041.387868] R10: 0008 R11: 0246 R12:
7ff3d0965b00
[22041.387870] R13: 01104320 R14: 7ff3d0965d40 R15:
01104050
[22041.387876] Mem-Info:
[22041.387885] active_anon:56054 inactive_anon:109301 isolated_anon:0
active_file:110190 inactive_file:91980 isolated_file:0
unevictable:9375 dirty:1 writeback:0 unstable:0
slab_reclaimable:22463 slab_unreclaimable:19122
  

Re: Why does “page allocation failure” occur whereas there are still “58*4096kB (C)” could be used?

2020-06-18 Thread sunshilong
>> Why does "page allocation failure" occur whereas there are still
"58*4096kB
>> (C)"(*I think it indicates there are 58 order 10 memory could be used*)
>> could be used?
>>
>> Here is the related log:
>>
>> [ 2161.623563] : page allocation failure: order:10,
>> mode:0x2084020(GFP_ATOMIC|__GFP_COMP)
>
>If you look at the source for alloc_ap_req(), you find it wants GFP_ATOMIC, not
>CMA.  And your box is fresh out of contiguous order-10 spaces that aren't CMA,
>and you're down to your last 3 order-9 flagged as (UEC).

Thank you for the clarification.
I understand it on a deeper level with your help.

Why doesn't the kernel use two memory blocks whose size is 2048KB (i.e. order 9)
instead of one block of order 10 (you see, there are still three free blocks
and 2048KB*2=4096KB, equivalent to the memory size of order 10)?
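
As far as I know, that choice is up to the caller rather than the page allocator: a single alloc_pages() call hands back exactly one block of the requested order, so using two order-9 blocks instead of one order-10 block only happens if the caller implements that fallback itself. A rough sketch of such a fallback loop (made-up helper name, flags chosen arbitrarily):

#include <linux/gfp.h>
#include <linux/mm.h>

/* Try the largest order first, then retry with smaller orders on failure. */
static struct page *alloc_largest_available(unsigned int max_order,
					    unsigned int *order_out)
{
	unsigned int order;
	struct page *page;

	for (order = max_order; ; order--) {
		page = alloc_pages(GFP_KERNEL | __GFP_NOWARN, order);
		if (page) {
			*order_out = order;
			return page;
		}
		if (order == 0)
			return NULL;
	}
}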

>If you look at the source for alloc_ap_req(), you find it wants GFP_ATOMIC, not
>CMA.
I followed your advice and read the related source code carefully.
It corresponds to the log (i.e. mode:0x2084020(GFP_ATOMIC|__GFP_COMP)).

Thank you for your attention to this matter.
Look forward to hearing from you.
Best regards.

Valdis Klētnieks wrote on Fri, Jun 19, 2020 at 12:48 PM:

> On Thu, 18 Jun 2020 14:21:05 +0800, sunshiong said:
>
> > Why does "page allocation failure" occur whereas there are still
> "58*4096kB
> > (C)"(*I think it indicates there are 58 order 10 memory could be used*)
> > could be used?
> >
> > Here is the related log:
> >
> > [ 2161.623563] : page allocation failure: order:10,
> > mode:0x2084020(GFP_ATOMIC|__GFP_COMP)
>
> Most likely, the allocation wanted some other type of allocation.
> The (C) on the order-10 says it's an CMA area.
>
> static const char types[MIGRATE_TYPES] = {
> [MIGRATE_UNMOVABLE] = 'U',
> [MIGRATE_MOVABLE]   = 'M',
> [MIGRATE_RECLAIMABLE]   = 'E',
> [MIGRATE_HIGHATOMIC]= 'H',
> #ifdef CONFIG_CMA
> [MIGRATE_CMA]   = 'C',
> #endif
> #ifdef CONFIG_MEMORY_ISOLATION
> [MIGRATE_ISOLATE]   = 'I',
> #endif
>
> If the call was for an unmovable, movable, reclaimable, or highatomic
> allocation, you lose.
>
> If you look at the source for alloc_ap_req(), you find it wants
> GFP_ATOMIC, not
> CMA.  And your box is fresh out of contiguous order-10 spaces that aren't
> CMA,
> and you're down to your last 3 order-9 flagged as (UEC).
>
> I admit I find it a tad suspicious that the USB gadget driver asks for a 4M
> chunk of memory.  Does USB actually support single transfers that large?
> (I'm
> not a USB expert)
>
>
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies