Basic Magnetic Robotics Multi-Agent Environment#

class magbotsim.rl_envs.basic_multi_agent_env.BasicMagBotMultiAgentEnv(layout_tiles: ndarray, num_movers: int, tile_params: dict[str, Any] | None = None, mover_params: dict[str, Any] | None = None, initial_mover_zpos: float = 0.005, table_height: float = 0.4, std_noise: ndarray | float = 1e-05, render_mode: str | None = 'human', default_cam_config: dict[str, Any] | None = None, width_no_camera_specified: int = 1240, height_no_camera_specified: int = 1080, collision_params: dict[str, Any] | None = None, initial_mover_start_xy_pos: ndarray | None = None, initial_mover_goal_xy_pos: ndarray | None = None, custom_model_xml_strings: dict[str, str] | None = None, use_mj_passive_viewer: bool = False)[source]

Bases: BasicMagBotEnv, ParallelEnv

A base class for multi-agent reinforcement learning environments in the field of Magnetic Robotics that follow the PettingZoo API. A more detailed explanation of all parameters can be found in the documentation of the BasicMagneticRoboticsEnv.

Parameters:

layout_tiles – the tile layout
num_movers – the number of movers
tile_params – tile parameters such as the size and mass, defaults to None
mover_params – mover parameters such as the size and mass, defaults to None
initial_mover_zpos – the initial distance between the bottom of the mover and the top of a tile, defaults to 0.005 [m]
table_height – the height of a table on which the tiles are placed, defaults to 0.4 [m]
std_noise – the standard deviation of a Gaussian with zero mean used to add noise, defaults to 1e-5
render_mode – the mode that is used to render the frames (‘human’, ‘rgb_array’ or None), defaults to ‘human’
default_cam_config – dictionary with attribute values of the viewer’s default camera, https://mujoco.readthedocs.io/en/latest/XMLreference.html?highlight=camera#visual-global, defaults to None
width_no_camera_specified – if render_mode != ‘human’ and no width is specified, this value is used, defaults to 1240
height_no_camera_specified – if render_mode != ‘human’ and no height is specified, this value is used, defaults to 1080
collision_params – a dictionary that can be used to specify collision parameters, defaults to None
initial_mover_start_xy_pos – the initial (x,y) starting positions of the movers, defaults to None
initial_mover_goal_xy_pos – the initial (x,y) goal positions of the movers, defaults to None
custom_xml_strings –
a dictionary containing additional XML strings to provide the ability to add actuators, sensors, objects, robots, etc. to the model. The keys determine where to add a string in the XML structure and the values contain the XML string to add. The following keys are accepted:
- custom_compiler_xml_str:
  A custom ‘compiler’ XML section. Note that the entire default ‘compiler’ section is replaced.
- custom_visual_xml_str:
  A custom ‘visual’ XML section. Note that the entire default ‘visual’ section is replaced.
- custom_option_xml_str:
  A custom ‘option’ XML section. Note that the entire default ‘option’ section is replaced.
- custom_assets_xml_str:
  This XML string adds elements to the ‘asset’ section.
- custom_default_xml_str:
  This XML string adds elements to the ‘default’ section.
- custom_worldbody_xml_str:
  This XML string adds elements to the ‘worldbody’ section.
- custom_contact_xml_str:
  This XML string adds elements to the ‘contact’ section.
- custom_outworldbody_xml_str:
  This XML string should be used to include files or add sections.
- custom_mover_body_xml_str_list:
  This list of XML strings should be used to attach objects to a mover. Note that this a list with length num_movers. If nothing is attached to a mover, add None at the corresponding mover index.
If set to None, only the basic XML string is generated, containing tiles, movers (excluding actuators), and possibly goals; defaults to None. This dictionary can be further modified using the _custom_xml_string_callback().
use_mj_passive_viewer – whether the MuJoCo passive_viewer should be used, defaults to False. If set to False, the Gymnasium MuJoCo WindowViewer with custom overlays is used.

_custom_xml_string_callback(custom_model_xml_strings: dict[str, str] | None = None) → dict[str, str] | None

A callback that should be used to add further functionality to the __init__() method. This callback should be used to modify the custom XML string in the custom_model_xml_strings dictionary after the tile, mover and collision parameters have been preprocessed and checked, but before the MuJoCo model XML string is generated. This allows adding custom XML strings based on the tile or mover configuration, e.g. to add actuators for each mover.

Parameters:: custom_model_xml_strings – a dictionary containing additional XML strings to provide the ability to add actuators, sensors, objects, robots, etc. to the model., defaults to None (see documentation of the __init__() method for more detailed information). Note that this dictionary may be modified within this method.
Returns:: the possibly modified dictionary with additional XML strings

_render_callback() → None: A callback that should be used to add further functionality to the render() method (see documentation of the render() method for more information about when the callback is called).

action_space(agent: AgentID) → Space

Takes in agent and returns the action space for that agent.

MUST return the same value for the same agent name

Default implementation is to return the action_spaces dict

check_mover_collision(mover_names: list[str], c_size: float | ndarray, add_safety_offset: bool = False, mover_qpos: ndarray | None = None, add_qpos_noise: bool = False) → bool

Check whether two movers specified in mover_names collide. In case of collision shape ‘box’, this method takes the orientation of the movers into account.

Parameters:

mover_names – a list of mover names that should be checked (correspond to the body name of the mover in the MuJoCo model)
c_size –
the size of the collision shape of the movers
- collision_shape = ‘circle’:
  use a single float value to specify the same size for all movers and a numpy array of shape (num_movers,) to specify individual sizes for each mover
- collision_shape = ‘box’:
  use a numpy array of shape (2,) to specify the same size for all movers and a numpy array of shape (num_movers,2) to specify individual sizes for each mover
add_safety_offset – whether to add the size offset (can be specified using: collision_params[“offset”]), defaults to False. Note that the same size offset is added for both movers.
mover_qpos – the qpos of the movers specified as a numpy array of shape (num_movers,7) (x_p,y_p,z_p,w_o,x_o,y_o,z_o). If set to None, the current qpos of the movers in the MuJoCo model is used; defaults to None
add_qpos_noise – whether to add Gaussian noise to the qpos of the movers, defaults to False. Only used if mover_qpos is not None.

Returns:

True if the movers collide, False otherwise

check_wall_collision(mover_names: list[str], c_size: float | ndarray, add_safety_offset: bool = False, mover_qpos: ndarray | None = None, add_qpos_noise: bool = False) → ndarray

Check whether the qpos of the movers listed in mover_names are valid, i.e. no wall collisions.

Parameters:

mover_names – a list of mover names that should be checked (correspond to the body name of the mover in the MuJoCo model)
c_size –
the size of the collision shape
- collision_shape = ‘circle’:
  use a single float value to specify the same size for all movers and a numpy array of shape (num_movers,) to specify individual sizes for each mover
- collision_shape = ‘box’:
  use a numpy array of shape (2,) to specify the same size for all movers and a numpy array of shape (num_movers,2) to specify individual sizes for each mover
add_safety_offset – whether to add the size offset (can be specified using: collision_params[“offset”]), defaults to False. Note that the same size offset is added for all movers.
mover_qpos – a numpy array of shape (num_qpos,7) containing the qpos (x_p,y_p,z_p,w_o,x_o,y_o,z_o) of each mover or None. If set to None, the current qpos of each mover in the MuJoCo model is used; defaults to None
add_qpos_noise – whether to add Gaussian noise to the qpos of the movers, defaults to False. Only used if mover_qpos is not None.

Returns:

a numpy array of shape (num_movers,), where an element is 1 if the qpos is valid (no wall collision), otherwise 0

close() → None: Close the environment.

generate_model_xml_string(mover_start_xy_pos: ndarray | None = None, mover_goal_xy_pos: ndarray | None = None, custom_xml_strings: dict[str, str] | None = None) → str

Generate a MuJoCo model XML string based on the mover-tile configuration of the environment.

Parameters:

mover_start_xy_pos – a numpy array of shape (num_movers,2) containing the (x,y) starting positions of each mover. If set to None, the movers will be placed in the center of a tile, i.e. the number of tiles must be >= the number of movers; defaults to None.
mover_goal_xy_pos – a numpy array of shape (num_movers_with_goals,2) containing the (x,y) goal positions of the movers (num_movers_with_goals <= num_movers). Note that only the first 6 movers have different colors to make the movers clearly distinguishable. Movers without goals are shown in gray. If set to None, no goals will be displayed and all movers are colored in gray; defaults to None
custom_xml_strings –
a dictionary containing additional XML strings to provide the ability to add actuators, sensors, objects, robots, etc. to the model. The keys determine where to add a string in the XML structure and the values contain the XML string to add. The following keys are accepted:
- custom_compiler_xml_str:
  A custom ‘compiler’ XML section. Note that the entire default ‘compiler’ section is replaced.
- custom_visual_xml_str:
  A custom ‘visual’ XML section. Note that the entire default ‘visual’ section is replaced.
- custom_option_xml_str:
  A custom ‘option’ XML section. Note that the entire default ‘option’ section is replaced.
- custom_assets_xml_str:
  This XML string adds elements to the ‘asset’ section.
- custom_default_xml_str:
  This XML string adds elements to the ‘default’ section.
- custom_worldbody_xml_str:
  This XML string adds elements to the ‘worldbody’ section.
- custom_contact_xml_str:
  This XML string adds elements to the ‘contact’ section.
- custom_outworldbody_xml_str:
  This XML string should be used to include files or add sections.
- custom_mover_body_xml_str_list:
  This list of XML strings should be used to attach objects to a mover. Note that this a list with length num_movers. If nothing is attached to a mover, add None at the corresponding mover index.
If set to None, only the basic XML string is generated, containing tiles, movers (excluding actuators), and possibly goals; defaults to None

Returns:

MuJoCo model XML string

get_c_size_arr(c_size: float | ndarray, num_reps: int) → ndarray

Return the size of the collision shape as a numpy array of shape (num_reps,1) or (num_reps,2) depending on the collision shape. This method should be used to obtain the appropriate c_size_arr if the same size is to be used for all movers.

Parameters:

c_size –
the size of the collision shape:
- collision_shape = ‘circle’:
  use a single float value to specify the same size for all movers and a numpy array of shape (num_reps,) to specify individual sizes for each mover
- collision_shape = ‘box’:
  use a numpy array of shape (2,) to specify the same size for all movers and a numpy array of shape (num_reps,2) to specify individual sizes for each mover
num_reps – the number of repetitions of c_size if the same size of collision shape is to be used for all movers. Otherwise, this value is ignored.

Returns:

the collision shape sizes as a numpy array of a suitable shape:

collision_shape = ‘circle’:
a numpy array of shape (num_reps,1)
collision_shape = ‘box’:
a numpy array of shape (num_reps,2) if c_size is a numpy array of shape (2,). Otherwise, c_size is not modified.

get_mover_qacc(mover_names: str | list[str], add_noise: bool = False) → ndarray

Return the qacc of several movers as a numpy array of shape (num_movers,6).

Parameters:

mover_names – a single mover name or a list of mover names for which the qacc should be returned (correspond to the body name of the mover in the MuJoCo model)
add_noise – whether to add Gaussian noise to the qacc of the movers, defaults to False

Returns:

a numpy array of shape (num_movers,6) containing the qacc (x,y,z,a,b,c) of each mover. The order of the qacc corresponds to the order of the mover names.

get_mover_qpos(mover_names: str | list[str], add_noise: bool = False) → ndarray

Return the qpos of several movers as a numpy array of shape (num_movers,7).

Parameters:

mover_names – a single mover name or a list of mover names for which the qpos should be returned (correspond to the body name of the mover in the MuJoCo model)
add_noise – whether to add Gaussian noise to the qpos of the movers, defaults to False

Returns:

a numpy array of shape (num_movers,7) containing the qpos (x_p,y_p,z_p,w_o,x_o,y_o,z_o) of each mover. The order of the qpos corresponds to the order of the mover names.

get_mover_qvel(mover_names: str | list[str], add_noise: bool = False) → ndarray

Return the qvel of several movers as a numpy array of shape (num_movers,6).

Parameters:

mover_names – a single mover name or a list of mover names for which the qvel should be returned (correspond to the body name of the mover in the MuJoCo model)
add_noise – whether to add Gaussian noise to the qvel of the movers, defaults to False

Returns:

a numpy array of shape (num_movers,6) containing the qvel (x,y,z,a,b,c) of each mover. The order of the qvel corresponds to the order of the mover names.

get_tile_indices_mask(mask: ndarray) → tuple[ndarray, ndarray]

Find the x and y indices of tiles that correspond to the specified structure (the mask) in the tile layout. Note that the indices of the top left tile in the mask are returned.

Parameters:: mask – a 2D numpy array containing only 0 and 1 which specifies the structure to be found in the tile layout
Returns:: the x and y indices of the tiles in separate numpy arrays, each of shape (num_mask_found,)

get_tile_xy_pos() → tuple[ndarray, ndarray]

Find the (x,y)-positions of the tiles. The position of a tile in the tile layout with index (i_x,i_y), can be found using (x-pos[i_x,i_y], y-pos[i_x,i_y]), where x-pos and y-pos are returned by this method. Note that the base frame is in the upper left corner.

Returns:: the x and y positions of the tiles in separate numpy arrays, each of shape (num_tiles_x, num_tiles_y)

observation_space(agent: AgentID) → Space

Takes in agent and returns the observation space for that agent.

MUST return the same value for the same agent name

Default implementation is to return the observation_spaces dict

qpos_is_valid(qpos: ndarray, c_size: float | ndarray, add_safety_offset: bool = False) → ndarray

Check whether qpos is valid. This method considers the edges as imaginary walls if there is no other tile next to that edge. A position is valid if it is above a tile and the distance to the walls is greater that the required safety margin, i.e. no collision with a wall. This also ensures that the position is reachable in case the specified position is a goal position.

This method allows to check multiple qpos at the same time, where the movers can be of different sizes. The orientation of the mover is taken into account if collision_shape = ‘box’, otherwise (collision_shape = ‘circle’) the orientation of the mover is ignored.

Parameters:

qpos – a numpy array of shape (num_qpos,7) containing the qpos (x_p,y_p,z_p,w_o,x_o,y_o,z_o) to be checked
c_size –
the size of the collision shape
- collision_shape = ‘circle’:
  use a single float value to specify the same size for all movers and a numpy array of shape (num_qpos,) to specify individual sizes for each mover
- collision_shape = ‘box’:
  use a numpy array of shape (2,) to specify the same size for all movers and a numpy array of shape (num_qpos,2) to specify individual sizes for each mover
add_safety_offset – whether to add the size offset (can be specified using: collision_params[“offset”]), defaults to False. Note that the same size offset is added for all movers.

Returns:

a numpy array of shape (num_qpos,), where an element is 1 if the qpos is valid, otherwise 0

render() → ndarray | None

Compute frames depending on the initially specified render_mode. Before the corresponding viewer is updated, the _render_callback() is called to give the opportunity to add more functionality.

Returns:: returns a numpy array if render_mode != ‘human’, otherwise it returns None (render_mode ‘human’)

reset(seed: int | None = None, options: dict | None = None) → tuple[dict[AgentID, ObsType], dict[AgentID, dict]]

Resets the environment.

And returns a dictionary of observations (keyed by the agent name)

state() → ndarray

Returns the state.

State returns a global view of the environment appropriate for centralized training decentralized execution methods like QMIX

step(actions: dict[AgentID, ActionType]) → tuple[dict[AgentID, ObsType], dict[AgentID, float], dict[AgentID, bool], dict[AgentID, bool], dict[AgentID, dict]]

Receives a dictionary of actions keyed by the agent name.

Returns the observation dictionary, reward dictionary, terminated dictionary, truncated dictionary and info dictionary, where each dictionary is keyed by the agent.

update_cached_mover_mujoco_data() → None: Update all cached information about MuJoCo objects, such as mover names, mover joint names, mover goal site names, etc.

window_viewer_is_running() → bool

Check whether the window viewer (render_mode ‘human’) is active, i.e. the window is open.

Returns:: True if the window is open, False otherwise