Richard Pajerski Software development and consulting

Possible bug with triggered NSFDBHOOK events in DOTS (on Windows 2019)

by Richard Pajerski


Posted on Friday December 23, 2022 at 01:13PM in Technology


[February 2023 update: DOTS follow-up, SPR for applet bug]

I recently took advantage of DOTS being back in the Domino 12 server to replace a Java agent with a scheduled DOTS tasklet, and I have been pleased with the results.  Tasklets are generally going to be far more efficient than Java agents in Domino: a JVM is loaded once with the DOTS server task and remains resident in memory until the DOTS task is stopped, whereas each agent invocation starts a new instance of the JVM.  There are other benefits to using tasklets over Java agents, which I may take up in a future post, but for the moment I want to describe an issue I've run across on a Windows 2019 server installation.

Although the deployment above uses a scheduled tasklet, I was originally hoping to use the triggered NSFDBHOOK events in order to capture some document saves in (more/less) real time.  But while testing on Windows 2019, I noticed that the HOOK_EVENT_NOTE_UPDATE and HOOK_EVENT_NOTE_OPEN events were not being emitted at all or only very infrequently.  I had earlier tested the same tasklet on a Domino installation on a Windows 8.1 client and the events fired more/less as expected.  Aside from the OS difference, everything about the Domino installations was identical -- with one exception: the Domino program installation directory on the Windows client had no spaces but the Windows server was installed in the default C:\Program<space>Files\HCL directory.  Sure enough, after reinstalling Domino on the Windows 2019 server without the space (specifically in C:\Domino), events began firing again.  HCL has also reproduced this and may open an SPR.

In the meantime, after working a bit more with those NSF hook events, my impression is that they are not altogether reliable -- or at least, there doesn't appear to be a one-to-one correlation with each document save/open and a DOTS-generated event.  Some document saves/opens never fire an event.  The source code for the older versions of DOTS is on openntf.org here: https://stash.openntf.org/projects/DOTS/repos/dots/browse/sources but I'm not sure if this is the same code being shipped with the Domino 12 server (though I assume it's pretty close).

If I'm looking in the right place, lines 79 and 80 of the postMessage method (https://stash.openntf.org/projects/DOTS/repos/dots/browse/sources/dotsNSFHook.cpp) have:

STATUS error = MQOpen(queueName, 0, &dotsmq);
 if ( error == NOERROR ){

where DOTS presumably intercepts the necessary events from an internal Domino queue.  But what if there *are* errors here?  Will our DOTS tasklets ever know about them?  Maybe errors are unlikely here but perhaps this is the source of some missed events.
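
To make the concern concrete, here is a minimal Java sketch of the difference between dropping an event silently when a queue cannot be opened and recording the failure first. It is a simulation only: `HookPostSketch`, `postSilently`, `postWithLogging` and the queue names are made up for illustration, none of it is DOTS or Domino API, and it says nothing about what the real C++ code does when MQOpen returns an error.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for the hook side of the pipeline.
// Nothing here is DOTS or Domino API; the names are made up for illustration.
class HookPostSketch {
    // Simulated named queues. A missing name models a failed queue open.
    private final Map<String, Queue<String>> queues = new HashMap<>();
    // Messages an administrator could actually see.
    final List<String> errorLog = new ArrayList<>();

    void createQueue(String name) {
        queues.put(name, new ArrayDeque<>());
    }

    // Acts only when the open succeeds; a failure leaves no trace at all.
    boolean postSilently(String queueName, String event) {
        Queue<String> q = queues.get(queueName);
        if (q != null) {
            q.add(event);
            return true;
        }
        return false;
    }

    // Same logic, but a failed open is recorded before the event is dropped.
    boolean postWithLogging(String queueName, String event) {
        Queue<String> q = queues.get(queueName);
        if (q == null) {
            errorLog.add("could not open queue '" + queueName + "'; dropped " + event);
            return false;
        }
        q.add(event);
        return true;
    }

    // Number of events waiting in the named queue (0 if it does not exist).
    int delivered(String queueName) {
        Queue<String> q = queues.get(queueName);
        return q == null ? 0 : q.size();
    }

    public static void main(String[] args) {
        HookPostSketch hook = new HookPostSketch();
        hook.createQueue("events");
        hook.postSilently("events", "update-1");
        hook.postSilently("missing", "update-2");    // lost without a trace
        hook.postWithLogging("missing", "update-3"); // lost, but logged
        System.out.println("delivered: " + hook.delivered("events"));
        System.out.println("logged errors: " + hook.errorLog);
    }
}
```

In the silent variant, a consumer on the other end of the queue has no way to distinguish "no documents were saved" from "events were produced but never posted", which is the situation a tasklet would be in if errors at this point go unreported.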



Comments:

The NSF Hook technology is quite old and was replaced by Extension Manager technology a long time ago.
I am surprised there is still software using it.

Message Queues (MQ) are a core functionality in Notes/Domino, used in many places for process-to-process communication. What would you do when this fails? For sure there should be logging, and an application could try to keep events and try again. But this should be a real-time event, and delaying it would not be expected by the application.

For sure there should be proper error logging to find out why an event is not captured. But I would look more into issues with the hook driver technology, and I would replace it with Extension Manager technology.

Posted by Daniel Nashed on December 24, 2022 at 05:56 PM EST #


2nd part of my message:

I don't see why Windows 2019 should make a difference. This might be a coincidence and your problem might have a different root cause.
And I would add logging in the affected code before opening the MQ, and there should be error logging around all calls.

But as long as the MQ is available, this open request should not fail.
And as long as the events are not overloaded (an MQ has a limit of 64K), there is no reason why it should fail.

So I would look for the root cause more around the hook driver.

Posted by Daniel Nashed on December 24, 2022 at 05:57 PM EST #
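
The point above about an overloaded queue can be illustrated with a short Java sketch of a bounded queue whose producer does not wait when the queue is full. This is an illustration only: `BoundedQueueSketch` and its methods are hypothetical names, the capacity here counts messages and is deliberately tiny, and how the real 64K limit is measured or enforced by Domino is not stated in this thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical model of a size-limited message queue; not the Domino MQ API.
class BoundedQueueSketch {
    private final BlockingQueue<String> queue;
    private int dropped = 0;

    BoundedQueueSketch(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    // Non-blocking post: offer() returns false instead of waiting when full.
    boolean post(String event) {
        boolean accepted = queue.offer(event);
        if (!accepted) {
            dropped++; // a real producer should log this, not just count it
        }
        return accepted;
    }

    // Consumer side: remove up to max events and report how many were taken.
    int drain(int max) {
        int taken = 0;
        while (taken < max && queue.poll() != null) {
            taken++;
        }
        return taken;
    }

    int dropped() {
        return dropped;
    }

    int pending() {
        return queue.size();
    }

    public static void main(String[] args) {
        BoundedQueueSketch mq = new BoundedQueueSketch(4);
        // A burst of saves arrives faster than the consumer reads.
        for (int i = 0; i < 10; i++) {
            mq.post("update-" + i);
        }
        System.out.println("pending: " + mq.pending() + ", dropped: " + mq.dropped());
    }
}
```

If something like this happens in the real pipeline, the consumer would see fewer events than there were document saves, and only a counter or log entry on the producer side would reveal it.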


Hi Daniel - thanks for your comments! To be clear, to test this, I'm using a one-line annotated method in a Java tasklet:

public class DotsTesting extends AbstractServerTask {
    @Triggered(eventId={IExtensionManagerEvent.HOOK_EVENT_NOTE_UPDATE} /*500*/)
    public void runHook(NSFHookNoteUpdateEvent evt, IProgressMonitor arg1) throws NotesException {
        logMessage("Note hook update in " + evt.getDbPath());
    }
}

So I'm testing using plain Java with out-of-the-box DOTS. The logMessage call (which itself just calls System.out.println()) doesn't print anything. My reference to the C++ code was a suggestion after browsing backward through src/com/ibm/dots at OpenNTF (which I think is ultimately where IExtensionManagerEvent leads to).

Best,
Richard

Posted by Richard Pajerski on December 24, 2022 at 08:34 PM EST #


Thanks for the clarification. You are assuming some details. I am not an expert about the DOTS task itself.
But from what it looks like, an NSF Hook driver is intercepting the low-level Domino callbacks (hook drivers are similar to Extension Manager events).
The hook driver opens DOTS's own MQ for communicating with the DOTS task (sending an event with the data).

Hooking into the low-level updates for documents isn't 1:1 compared to LotusScript or Java. You might get more events, but not fewer.

Domino process where HOOK Driver is loaded --> Captured Event --> MQ Open to send event data --> DOTS task reads from MQ and triggers action.

I don't see where Windows 2019 could make a difference here.
Is there already an SPR? Can you share your details of the ticket offline?
You got my mail address from my post.

Posted by Daniel Nashed on December 25, 2022 at 05:11 AM EST #
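
The flow described in the comment above (capture in the hooked process, hand-off through a queue, dispatch in the DOTS task) can be sketched in plain Java. This is a single-process simulation under stated assumptions: `HookPipelineSketch`, `HookEvent`, `capture`, `onEvent` and `dispatchPending` are hypothetical names, the value 500 for the note-update event is taken from the comment in the tasklet code earlier in this thread, and `OTHER_EVENT` is a made-up placeholder ID.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical single-process model of: hook -> queue -> dispatcher -> handler.
class HookPipelineSketch {
    static final int NOTE_UPDATE = 500; // value from the tasklet comment above
    static final int OTHER_EVENT = -1;  // placeholder ID, purely illustrative

    // Minimal event payload; the fields are assumptions, not the DOTS classes.
    static final class HookEvent {
        final int eventId;
        final String dbPath;

        HookEvent(int eventId, String dbPath) {
            this.eventId = eventId;
            this.dbPath = dbPath;
        }
    }

    private final Queue<HookEvent> mq = new ArrayDeque<>();
    private final Map<Integer, List<Consumer<HookEvent>>> handlers = new HashMap<>();

    // Step 1: the hook captures every event and puts it on the queue.
    void capture(int eventId, String dbPath) {
        mq.add(new HookEvent(eventId, dbPath));
    }

    // A tasklet-like handler subscribes to one event ID.
    void onEvent(int eventId, Consumer<HookEvent> handler) {
        handlers.computeIfAbsent(eventId, k -> new ArrayList<>()).add(handler);
    }

    // Step 2: the consumer reads the queue and filters by event ID.
    // Returns the number of handler invocations.
    int dispatchPending() {
        int invoked = 0;
        HookEvent e;
        while ((e = mq.poll()) != null) {
            List<Consumer<HookEvent>> subscribed =
                handlers.getOrDefault(e.eventId, Collections.emptyList());
            for (Consumer<HookEvent> h : subscribed) {
                h.accept(e);
                invoked++;
            }
        }
        return invoked;
    }

    public static void main(String[] args) {
        HookPipelineSketch pipeline = new HookPipelineSketch();
        pipeline.onEvent(NOTE_UPDATE,
            e -> System.out.println("Note hook update in " + e.dbPath));
        pipeline.capture(NOTE_UPDATE, "mail/user1.nsf");
        pipeline.capture(OTHER_EVENT, "names.nsf"); // read from the queue, then ignored
        pipeline.capture(NOTE_UPDATE, "apps/crm.nsf");
        System.out.println("handlers invoked: " + pipeline.dispatchPending());
    }
}
```

In this model every captured event crosses the queue even when no handler is subscribed to it, so the filtering cost is paid on the consumer side. Whether the shipped DOTS code behaves the same way is not something this sketch establishes.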


Hi Daniel --

The first half of my post deals with the space in the program installation directory. I opened a case for that with HCL and they said they could reproduce the problem. That case is still open but they haven't created an SPR yet (they're still researching it). There's not much more to the ticket other than that it's easy to reproduce the problem: simply "load dots" from a Domino server console. The "TriggeredExample" sample tasklet (which ships with Domino 12) doesn't show any console output when there's a space present in the installation directory but *does* show output when the space is removed.

The second half of the post -- about triggered events in general on an installation *without* spaces -- is a separate issue. I don't think the two are directly related. Your point about the 1:1 updates was also my assumption when I started testing the event triggers but my results show that some events never get sent to the tasklet.

Thanks again for your comments!

Posted by Richard Pajerski on December 26, 2022 at 11:27 AM EST #


Your post says "Windows 2019" is causing the problem. Sounds like you only have a problem with the space in the installation path?

To understand how the trigger works, you have to understand the underlying technology in detail. Hook drivers and Extension managers capture any update.

Hook drivers should not be used at all and might not be reliable, since they may not catch all event types. Extension Managers, in contrast, are fully supported.
But in this case the extension manager isn't written in the right way. It simply captures all events for Note update plus many other events and sends them to DOTS via the MQ. DOTS has to filter them.

So I can't recommend using either one! Neither the hook driver nor the Extension Manager should have been added when DOTS came back.

I spoke with one of the initial contributors of the DOTS project and he said that the hook/extension manager code was more experimental than ready to be released.

Not sure which use case you have in mind, but I would use a different approach, since this approach isn't scalable and isn't completely real time.

Posted by Daniel Nashed on December 29, 2022 at 04:50 AM EST #


Another second part reply:

I spoke with one of the initial contributors of the DOTS project and he said that the hook/extension manager code was more experimental than ready to be released. And he did not know anyone using it in production.

I have written highly optimized Extension Managers for special needs. But in most cases polling, with a check of the database modified time before searching, works well.

Posted by Daniel Nashed on December 29, 2022 at 04:54 AM EST #
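
The polling approach suggested above (check the database modified time first, search only if it has changed) can be sketched as follows. This is a minimal illustration, not Domino code: `DatabaseView`, `FakeDb`, `lastModifiedMillis` and `findChangedSince` are hypothetical names standing in for whatever the real API offers, and timestamps are plain numbers for simplicity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical poller: skips the expensive search when nothing has changed.
class ModifiedTimePollerSketch {
    // Assumed minimal view of a database; not a Domino interface.
    interface DatabaseView {
        long lastModifiedMillis();                   // cheap check
        List<String> findChangedSince(long millis);  // expensive search
    }

    // In-memory fake used only to exercise the poller.
    static final class FakeDb implements DatabaseView {
        private final Map<String, Long> docs = new LinkedHashMap<>();
        private long lastModified = 0L;

        void save(String docId, long timeMillis) {
            docs.put(docId, timeMillis);
            lastModified = Math.max(lastModified, timeMillis);
        }

        public long lastModifiedMillis() {
            return lastModified;
        }

        public List<String> findChangedSince(long millis) {
            List<String> changed = new ArrayList<>();
            for (Map.Entry<String, Long> e : docs.entrySet()) {
                if (e.getValue() > millis) {
                    changed.add(e.getKey());
                }
            }
            return changed;
        }
    }

    private final DatabaseView db;
    private long lastSeen;
    private int searches = 0;

    ModifiedTimePollerSketch(DatabaseView db, long startMillis) {
        this.db = db;
        this.lastSeen = startMillis;
    }

    // One polling cycle, e.g. called from a scheduled task.
    List<String> pollOnce() {
        long modified = db.lastModifiedMillis();
        if (modified <= lastSeen) {
            return new ArrayList<>(); // nothing changed: no search at all
        }
        searches++;
        List<String> changed = db.findChangedSince(lastSeen);
        lastSeen = modified;
        return changed;
    }

    int searches() {
        return searches;
    }

    public static void main(String[] args) {
        FakeDb db = new FakeDb();
        ModifiedTimePollerSketch poller = new ModifiedTimePollerSketch(db, 0L);
        db.save("doc1", 10L);
        System.out.println("changed: " + poller.pollOnce());
        System.out.println("changed: " + poller.pollOnce()); // no search this time
        System.out.println("searches run: " + poller.searches());
    }
}
```

The trade-off compared with triggered events is latency: changes are picked up at the next polling cycle rather than as they happen, but idle cycles cost only the timestamp check.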


Hi Daniel -- I appreciate your input. I ended up using the scheduled tasklet approach (with some polling), which is working great and is a much better option than my previous Java agent. However, I'd really like to use the triggered approach -- it seems cleaner overall for my use case.

What exactly is wrong with the way the current extension manager is written? If there are known problems, one would hope that HCL would put a proviso in the documentation somewhere...

Does the "initial contributor" you spoke with work at HCL or have some sway there? :-)

By the way, it's Ok to post more than 1,000 characters in your comments -- the blog simply marks it as SPAM but doesn't delete it.

Posted by Richard Pajerski on December 29, 2022 at 11:46 AM EST #

