7. Remote job

With the remote job submission function, you can run a solver on a Linux machine (the **remote server**) other than the machine on which Winmostar is installed.

7.1. Supported job schedulers

Winmostar supports the following job schedulers:

  • TORQUE (PBS)

  • SGE, UGE

  • SLURM

  • NQS

  • NQS2

  • Winmostar Job Manager

If the corresponding job scheduler is not installed on the remote server, you can execute the remote job in the following way.

  1. Prepare commands or scripts that mimic commands such as qsub and qstat and, if necessary, specify the prefix of these commands in Prefix for Queueing Commands.

  2. Select Run in the Queue setting.
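
As a rough illustration of step 1, the sketch below defines stand-ins for qsub and qstat on a server that has no scheduler. The names fake_qsub and fake_qstat, the scratch directory, and the use of the process ID as the job ID are all assumptions made for this example. The output format that Winmostar expects from such commands is not described here, so treat this only as a starting point.

```shell
#!/bin/sh
# Hypothetical stand-ins for qsub/qstat (not part of Winmostar).
# Job IDs are simply process IDs recorded in a scratch directory.
FAKE_JOB_DIR="${FAKE_JOB_DIR:-${TMPDIR:-/tmp}/fake_jobs_$$}"
mkdir -p "$FAKE_JOB_DIR"

# fake_qsub SCRIPT: run SCRIPT in the background and print its job ID.
fake_qsub() {
    sh "$1" > "$1.o" 2> "$1.e" &
    job_id=$!
    echo "$1" > "$FAKE_JOB_DIR/$job_id"   # remember which script this ID belongs to
    echo "$job_id"
}

# fake_qstat: list the recorded jobs that are still running.
fake_qstat() {
    for f in "$FAKE_JOB_DIR"/[0-9]*; do
        [ -e "$f" ] || continue
        id=${f##*/}
        if kill -0 "$id" 2>/dev/null; then
            echo "$id R $(cat "$f")"
        fi
    done
}
```

Finished jobs simply disappear from the fake_qstat listing, which loosely mirrors how a real scheduler stops reporting completed jobs.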

7.2. Remote Job Setup Procedures

See Basic Operation Flow for the overall procedure to run a remote job.

For details of each function, see Each function of the Submit Job window.

  1. Install and configure the job scheduler and solver on the server where you want to run the calculations. At this stage, you must be able to log in to the server via SSH without Winmostar and start the solver from the terminal.

    If you are planning to install it now, please refer to here.

    If you want to run jobs without using job scheduling (not recommended when running many jobs), you do not need to set up a job scheduler and should select Run in the Queue setting later.

  1. Click the Submit Remote Job button |toolbar_submit| on the toolbar.

  1. In the Submit Remote Job window, to use a profile that is already configured, select it in Profile. To set up a new profile, select Manage… ‣ Add Profile.

    fig_remote_addprofile

  1. Enter the following items at the top of the Edit Profile window.

    • Profile Name

    • Connection

      • Hostname

      • Port (normally 22 is used)

      • Timeout (Use default value if you do not know)

      • Username

      • Password

      • SSH Key (Set as required)

    fig_remote_editconnection

    Please contact us if you need a multi-hop SSH connection to systems such as TSUBAME or FOCUS.

  1. To test the SSH connection, click the Test Connection button at the bottom of the Edit Profile window. Click Connection test only when prompted “Do you want to test the connection and run the job scheduler on the remote server?”

    On the first connection, a black terminal window may open and display Store key in cache? (y/n). In that case, enter y.

    fig_remote_storekey

    If the connection is successful, the message “Test completed successfully” will be displayed.

    If the username or other settings are incorrect, the message “The test ended abnormally” is displayed. In that case, review the settings entered in the previous step. Even if you entered the correct password when prompted during the test, re-enter it in the Edit Profile window.

    If “ERROR: Connection timed out or an error occurred.” is displayed at the bottom of the Submit Remote Job window, review the connection settings.

  1. Enter the following items at the bottom of the Edit Profile window.

    • Queue & Solver

      • Queue

      • Options (command arguments for submitting jobs such as qsub)

    fig_remote_editqueue

    First, in Queue, select the job scheduler installed on the server you are connecting to. Next, in the Options field, enter the arguments to pass to the job submission command (such as qsub or sbatch). The resources to be allocated are specified here.

    For convenience, we recommend using aliases in the template file and in Options. Aliases are replaced at job execution with job-specific values such as the degree of parallelism and the file name. For details, see Alias string available for remote job function.

  1. To test the scheduler’s operation, click the Test Connection button at the bottom of the Edit Profile window. If the queue on the remote server is full and jobs do not start immediately, either wait until the queue frees up or proceed to the next step. Click Execute both when you see “Do you want to run the connection test and the Job Scheduler behavior test on the remote server?” When prompted “Enter the maximum waiting time for the Job Scheduler run test”, set a suitable value and click OK. Click No when you see “Do you want to run the test of (solver name) on the remote server as well?”

    If the connection is successful, the message “Test completed successfully” will be displayed.

  1. Enter the following items at the bottom of the Edit Profile window.

    • Queue & Solver

      • Solver

      • Shell Script

    fig_remote_editqueue

    First, in Queue, select the job scheduler installed on the server you are connecting to, and in Solver, select the solver to use on that server. Next, select Use Template under Shell Script. If no template exists for the selected solver, enter a name for the new template, and it will open in a text editor. If a template already exists, select it from the pull-down menu next to Use Template and click the Edit button below it to open the template file in a text editor.

    In the template script, enter the commands needed to use the selected solver on that server, such as module load ..., source ..., export PATH=..., and mpirun. As far as possible, place them between # Insert commands here and # Do not modify the followings.

    For convenience, we recommend using aliases in the template file and in Options. Aliases are replaced at job execution with job-specific values such as the degree of parallelism and the file name. For details, see Alias string available for remote job function.

  1. To test the solver’s operation, click the Test Connection button at the bottom of the Edit Profile window. If the queue on the remote server is full and jobs do not start immediately, either wait until the queue frees up or proceed to the next step. Click Execute both when you see “Do you want to run the connection test and the Job Scheduler behavior test on the remote server?” When prompted “Enter the maximum waiting time for the Job Scheduler run test”, set a suitable value and click OK. Click Yes when you see “Do you want to run the test of (solver name) on the remote server as well?”

    If the connection is successful, the message “Test completed successfully” will be displayed.

    If the settings are incorrect, the message “Test ended abnormally” is displayed. In that case, review the settings entered in the previous step.

  1. Press the OK button to close the Edit Profile window.

  1. Click the Close button to close the Submit Remote Job window. When prompted “Do you want to save the remote server configuration?”, click Yes.
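
To make the template step in the procedure above more concrete, the lines below sketch what might be placed between the two marker comments of a template script. The module name and the install path /opt/solver/bin are hypothetical and must be replaced with the values for your server. The %WM_NUM_THREAD% alias is replaced by Winmostar at job execution, as described in Alias string available for remote job function.

```shell
# Insert commands here
# module load openmpi                    # hypothetical module name; uncomment if the server uses environment modules
export PATH="/opt/solver/bin:$PATH"      # hypothetical solver install path
export OMP_NUM_THREADS=%WM_NUM_THREAD%   # alias; replaced with the thread count at job execution
# Do not modify the followings
```

Keeping server-specific settings in this block means the rest of the template can stay unchanged across solvers on the same server.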

7.3. Advanced Operation of Remote Jobs

  • In the Submit Remote Job window, click Queue ‣ Show Usage of Each Queue and make sure that the remote server information is displayed at the bottom of the window.

  • If you want to start a job in file mode, click the Send & Submit button. The operation here is the same as for a normal local job.

    fig_remote_sendsub_button

    The ID of the submitted job is displayed at the bottom of the window. The ID is needed to kill the job.

    The directory in which the job runs on the remote server can be set in Remote Directory of Profile ‣ Edit Profile. The directory actually used is displayed in the Remote Directory field of Submit Remote Job.

    When a job is started on a remote server, standard output is written to the file winmos.o and standard error is written to the file winmos.e.

  • If you want to check the status of jobs submitted in file mode, use Queue ‣ List Submitted Jobs. If all jobs are complete, you will see ---.

    If a submitted job finishes very quickly, --- is displayed even immediately after submission.

  • To check the status of a specific job on a remote server, use the following buttons.

    • ls button

    • cat button

    • grep button

    • tail button

    • Get & Open … button

    fig_remote_progress_button

    The target job is displayed in the Remote Directory field. With the default settings, to target a job, open its input file in the main window and select the profile that was used when the job was submitted.

  • To analyze on your local machine the results of a file-mode job that has finished on a remote server, click the Get All Files button.

    fig_remote_get_button

    The target job is displayed in the Remote Directory field. With the default settings, to target a job, open its input file in the main window and select the profile that was used when the job was submitted.

    After the files have been retrieved, the results can be analyzed in the same way as for a local job.
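
If you are logged in to the remote server directly, a similar check can be done by hand. The sketch below is not what the buttons themselves run. It relies only on the winmos.o and winmos.e file names mentioned above; the function name show_job_output and the default of 20 lines are illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper: show the end of a job's standard output and any
# standard error recorded in the given job directory.
# Usage: show_job_output [DIR] [LINES]
show_job_output() {
    dir=${1:-.}
    lines=${2:-20}
    if [ -f "$dir/winmos.o" ]; then
        echo "== last $lines lines of winmos.o =="
        tail -n "$lines" "$dir/winmos.o"
    else
        echo "winmos.o not found in $dir (the job may not have started yet)"
    fi
    # -s is true only when the file exists and is not empty
    if [ -s "$dir/winmos.e" ]; then
        echo "== winmos.e =="
        cat "$dir/winmos.e"
    fi
}
```

Run it as show_job_output followed by the path shown in the Remote Directory field.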

7.4. Each function of the Submit Job window

File menu
Revert All Changes

Discard the changes and reload the server configuration file.

Restore Setting File

Restore the server configuration file to its factory default state.

Import Setting File

Load a server configuration file and add the profiles it contains to the list of existing profiles.

Close

Close this window.

Profile Menu
Add Profile, Duplicate Profile, Remove Profile

Add, duplicate, and delete server connection profiles. The same operation is possible from the Manage button in the window.

Edit Profile

Edit the profile of the server connection. Some settings can be edited directly in the Submit Job window.

Profile name

Specify the profile name displayed in the Submit Job window.

Hostname

Specify the host name or IP address of the remote server.

Port

Specify the port number used for connection.

Timeout

Specify the time (in seconds) after which the connection is closed automatically when the remote server does not respond.

Username

Specify the login ID (user name) to the remote server.

Password

Specify the password for the login ID. Click [View] to display the password in plain text.

SSH Key

Set the SSH key as needed.

Queue

Select the type of job scheduler running on the remote server to be connected.

Solver

Select the program to use in this profile.

You can also change it in the window.

Shell Script

Check Use Default to run calculations with the default shell script, or Use Template to customize the shell script. If you check Use Template, select the template file to use from the pull-down menu beside it. To add, edit, or delete a template file, use the Add, Edit, or Remove button.

You can use the aliases listed in Alias string available for remote job function in the template file.

Template files are saved in the UserPref folder in the Winmostar installation folder.

You can also change it in the window.

Options

Specify the arguments to pass to the job submission command (qsub, etc.).

You can use Alias string available for remote job function for this item.

You can also change it in the window.

Remote Directory

Specify the working directory on the remote server. If this field is empty, the working directory is (Local User ID)/(program name)/(file name) under the home directory. The Local User ID is the name of the Windows user running Winmostar and is displayed in the title of the Submit Remote Job window. If the Local User ID contains double-byte characters or spaces, the directory name is internally converted to single-byte alphanumeric characters. If you enclose a path in single quotation marks, as in '/work/dir', (Local User ID)/(program name)/(file name) is created under the specified directory. If you enclose it in two single quotation marks on each side, as in ''/work/dir'', the (Local User ID) directory is not created.

You can use Alias string available for remote job function for this item.

Prefix for Queueing Commands

If commands such as qsub need a prefix when they are executed, specify it here. Normally this field is left empty.

Test Connection

Tests the SSH connection. Note that the job scheduler is not tested.

Connection Menu
Test Connection Using SFTP

Test SSH connection and job scheduler operation.

The same operation is possible with the Test Connection button in the window.

Share SSH Connection Once Established

Keeps the SSH connection open. If you run this once before operations that use an SSH connection, subsequent operations become faster.

Open Putty

Open the Putty setting window and make detailed settings for connection.

Do Not Use Putty for Connection (experimental)

Connects without using Putty. Because the connection is kept open, operations become faster. When connecting with key authentication, the public key must also be specified.

Job Menu
Send Local Files & Submit Job

Generates the input files needed for the calculation, transfers them to the remote server via SFTP, and submits the job to the job scheduler. The ID of the submitted job is displayed at the bottom of the window. The ID is needed to kill the job.

The same operation is possible with the Send & Submit button in the window.

Submit Job

Generates the input files needed for the calculation and transfers them to the remote server via SFTP.

List Files at Remote Directory

Get the list of files in Remote Directory.

The same operation is possible with the ls button in the window.

Display Remote File

Retrieve the contents of the selected file in Remote Directory.

The same operation is possible with the cat button in the window.

Display Last Part of Remote Log File

Get the end of the log file in Remote Directory.

The same operation is possible with the tail button in the window.

Search String in Remote Log File

Search strings in the log files in Remote Directory.

The same operation is possible with the grep button in the window.

Restart Terminated Job

If a remote job is forcibly interrupted by the job scheduler, etc., this function restarts the calculation.

Force Job Finalization

If not all files were generated because the calculation terminated abnormally, and Get All Remote Files therefore does not work properly, run this function to forcibly finalize the job so that Get All Remote Files can be executed.

Get Remote File and …

Get a specific file in Remote Directory and visualize it.

The same operation is possible with the Get File & … button in the window.

Queue menu

The actual command used by the selected job scheduler is displayed in parentheses in each menu name.

List Submitted Jobs

Get a list of jobs registered in the job scheduler.

The same operation is possible with the button with the same command name in the window.

Kill Submitted Job

Cancels a job registered in the job scheduler. You must enter the job ID that was displayed immediately after submission.

The same operation is possible with the button with the same command name in the window.

List Submitted Jobs in Detail

Get a detailed list of jobs registered in the job scheduler.

The same operation is possible with the button with the same command name in the window.

Show Information of Each Queue

Get the list of queues managed by the job scheduler.

The same operation is possible with the button with the same command name in the window.

Show Usage of Each Queue

Get usage status of each queue.

The same operation is possible with the button with the same command name in the window.

Show Information of All Nodes

Get information on all machines managed by Job Scheduler.

The same operation is possible with the button with the same command name in the window.

Other menu

The same command as the item name is executed on the remote server.

Options Menu
Enable Admin Mode

Used to access the remote server with root privileges.

7.5. Alias string available for remote job function

The contents of shell scripts and the arguments of submission commands used to run jobs may change with the calculation conditions. Aliases let you write these values once and have them filled in for each job.

A list of available alias strings is shown below.

%WM_USER_ID%

Local user ID for remote directory creation

%WM_SOLVER%

Type of solver

%WM_INPUT%

Input file name

%WM_PREFIX%

Input file name minus extension

%WM_EXT%

Input file name extension

%WM_NUM_PROC%

Number of parallel CPU (MPI) processes

%WM_NUM_THREAD%

Number of parallel threads

%WM_NUM_PARALLEL%

Product of %WM_NUM_PROC% and %WM_NUM_THREAD%
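
The relationships among these aliases can be illustrated with a small sketch. The function expand_aliases below is hypothetical and is not how Winmostar itself performs the substitution. It only mimics the replacement with sed for a given user ID, solver name, input file name, and process and thread counts. It assumes that %WM_EXT% excludes the leading dot, that the input file name has an extension, and that the values contain no | or & characters. The names my_solver and job.inp are placeholders.

```shell
#!/bin/sh
# Hypothetical illustration of alias replacement (not Winmostar's own code).
# Usage: expand_aliases USER_ID SOLVER INPUT NUM_PROC NUM_THREAD < template
expand_aliases() {
    user_id=$1 solver=$2 input=$3 nproc=$4 nthread=$5
    prefix=${input%.*}               # input file name minus extension
    ext=${input##*.}                 # extension (assumed to exclude the dot)
    nparallel=$((nproc * nthread))   # product of processes and threads
    sed -e "s|%WM_USER_ID%|$user_id|g" \
        -e "s|%WM_SOLVER%|$solver|g" \
        -e "s|%WM_INPUT%|$input|g" \
        -e "s|%WM_PREFIX%|$prefix|g" \
        -e "s|%WM_EXT%|$ext|g" \
        -e "s|%WM_NUM_PROC%|$nproc|g" \
        -e "s|%WM_NUM_THREAD%|$nthread|g" \
        -e "s|%WM_NUM_PARALLEL%|$nparallel|g"
}

# Example: expand one template line for a 4-process, 2-thread job.
echo 'mpirun -np %WM_NUM_PROC% my_solver %WM_INPUT% > %WM_PREFIX%.log' |
    expand_aliases wmuser my_solver job.inp 4 2
```

With these arguments, %WM_NUM_PARALLEL% would expand to 8, which is the value typically passed to a scheduler option that requests a total core count.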

7.6. Remote job configuration file

Profile settings are saved in UserPref\winmos_profile.ini in the Winmostar installation folder. To maintain compatibility with V8 and earlier versions, the settings are read in the following order of priority.

UserPref\winmos_profile.ini > UserPref\winmos_server.ini > wm_system\RemoteJobdefault_profile.ini
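
One way to read this priority order is "use the first of these files that exists". The sketch below expresses that reading as a shell function. The helper name first_existing is hypothetical, and treating the rule as a simple first-match lookup is an assumption, since the text above does not say whether the files are merged.

```shell
#!/bin/sh
# Hypothetical helper: print the first existing file among the arguments,
# which are given in descending order of priority.
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1   # none of the candidates exists
}

# Example call from a Cygwin shell in the Winmostar installation folder
# (paths written with forward slashes; the default profile under
# wm_system would be passed as the last argument):
# first_existing UserPref/winmos_profile.ini UserPref/winmos_server.ini
```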

7.7. How to use Windows server

A Windows PC can also be used as a remote server. The following preparations are required to use it.

  • Install an OpenSSH server on the remote server so that clients can connect with SSH.

  • Install Winmostar on the remote server and keep Winmostar Job Manager running at all times.

Configure the profile as follows.

  • In the Edit Profile window, select JM (Windows) for Queue.

  • In Winmostar Path, specify the path where Winmostar is installed on the remote server.

  • Since the default shell script cannot be used, select Use Template and create a template file. Write the template as a batch file that runs on Windows.

Operation differs from other job schedulers as follows.

  • Pressing the Test Connection button also checks whether Job Manager is running.

  • The information displayed by the List Jobs button is the same as Job Manager, from the left: number, status, priority, number of cores, job name, start date/time, end date/time, and batch file.

  • If you want to cancel the job with the Delete Job button, enter the job name.

7.8. How to connect via HTTP proxy

To connect to a remote server via SSH through an HTTP proxy server, follow the steps below.

  • Select Tools ‣ Cygwin to start Cygwin.

  • Enter the command shown below, replacing REMOTE_SERVER, PROXY_HOST_NAME, PROXY_PORT, and USER_NAME with the remote server name, proxy server name, proxy port number, and user name, respectively.

ssh -L1234:REMOTE_SERVER:22 -o "ProxyCommand connect-proxy -H PROXY_HOST_NAME:PROXY_PORT %h %p" USER_NAME@REMOTE_SERVER

  • While this connection is open, set Hostname to localhost and Port to 1234 in the Edit Profile window.