[1]: Yanfei Guo, Ken Raffenetti, Hui Zhou, Pavan Balaji, Min Si, Abdelhalim Amer, Shintaro Iwasaki, Sangmin Seo, Giuseppe Congiu, Robert Latham, Lena Oden, Thomas Gillis, Rohit Zambre, Kaiming Ouyang, Charles Archer, Wesley Bland, Jithin Jose, Sayantan Sur, Hajime Fujita, Dmitry Durnov, Michael Chuvelev, Gengbin Zheng, Alex Brooks, Sagar Thapaliya, Taru Doodi, Maria Garzaran, Steve Oyanagi, Marc Snir, and Rajeev Thakur. Preparing MPICH for exascale. Int. J. High Perform. Comput. Appl., 39(2):283--305, 2025. [ http ]
[2]: Hui Zhou, Ken Raffenetti, Wesley Bland, and Yanfei Guo. Generating bindings in MPICH. CoRR, abs/2401.16547, 2024. [ http ]
[3]: Hui Zhou, Robert Latham, Ken Raffenetti, Yanfei Guo, and Rajeev Thakur. MPI progress for all. In SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, Atlanta, GA, USA, November 17-22, 2024, pages 425--435. IEEE, 2024. [ http ]
[4]: Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Jinyang Liu, Yafan Huang, Ken Raffenetti, Hui Zhou, Kai Zhao, Zizhong Chen, Franck Cappello, Yanfei Guo, and Rajeev Thakur. POSTER: optimizing collective communications with error-bounded lossy compression for GPU clusters. In Michel Steuwer, I-Ting Angelina Lee, and Milind Chabbi, editors, Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2024, Edinburgh, United Kingdom, March 2-6, 2024, pages 454--456. ACM, 2024. [ http ]
[5]: Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Zhaorui Zhang, Jinyang Liu, Xiaoyi Lu, Ken Raffenetti, Hui Zhou, Kai Zhao, Zizhong Chen, Franck Cappello, Yanfei Guo, and Rajeev Thakur. An optimized error-controlled MPI collective framework integrated with lossy compression. In IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024, San Francisco, CA, USA, May 27-31, 2024, pages 752--764. IEEE, 2024. [ http ]
[6]: Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Jinyang Liu, Yafan Huang, Ken Raffenetti, Hui Zhou, Kai Zhao, Xiaoyi Lu, Zizhong Chen, Franck Cappello, Yanfei Guo, and Rajeev Thakur. gZCCL: Compression-accelerated collective communication framework for GPU clusters. In Kenji Kise, Valentina Salapura, Murali Annavaram, and Ana Lucia Varbanescu, editors, Proceedings of the 38th ACM International Conference on Supercomputing, ICS 2024, Kyoto, Japan, June 4-7, 2024, pages 437--448. ACM, 2024. [ http ]
[7]: Hui Zhou, Ken Raffenetti, Junchao Zhang, Yanfei Guo, and Rajeev Thakur. Frustrated with MPI+Threads? Try MPIxThreads! In Proceedings of the 30th European MPI Users' Group Meeting, EuroMPI 2023, Bristol, United Kingdom, September 11-13, 2023, pages 2:1--2:10. ACM, 2023. [ http ]
[8]: Thomas Gillis, Ken Raffenetti, Hui Zhou, Yanfei Guo, and Rajeev Thakur. Quantifying the performance benefits of partitioned communication in MPI. In Proceedings of the 52nd International Conference on Parallel Processing, ICPP 2023, Salt Lake City, UT, USA, August 7-10, 2023, pages 285--294. ACM, 2023. [ http ]
[9]: Jiajun Huang, Kaiming Ouyang, Yujia Zhai, Jinyang Liu, Min Si, Ken Raffenetti, Hui Zhou, Atsushi Hori, Zizhong Chen, Yanfei Guo, and Rajeev Thakur. Accelerating MPI collectives with process-in-process-based multi-object techniques. In Ali Raza Butt, Ningfang Mi, and Kyle Chard, editors, Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2023, Orlando, FL, USA, June 16-23, 2023, pages 333--334. ACM, 2023. [ http ]
[10]: Jiajun Huang, Kaiming Ouyang, Yujia Zhai, Jinyang Liu, Min Si, Ken Raffenetti, Hui Zhou, Atsushi Hori, Zizhong Chen, Yanfei Guo, and Rajeev Thakur. PiP-MColl: Process-in-process-based multi-object MPI collectives. In IEEE International Conference on Cluster Computing, CLUSTER 2023, Santa Fe, NM, USA, October 31 - November 3, 2023, pages 354--364. IEEE, 2023. [ http ]
[11]: Hui Zhou, Ken Raffenetti, Yanfei Guo, and Rajeev Thakur. MPIX stream: An explicit solution to hybrid MPI+X programming. In EuroMPI/USA'22: 29th European MPI Users' Group Meeting, Chattanooga, TN, USA, September 26 - 28, 2022, pages 1--10. ACM, 2022. [ http ]
[12]: Abdelhalim Amer, Charles Archer, Michael Blocksome, Chongxiao Cao, Michael Chuvelev, Hajime Fujita, Maria Garzaran, Yanfei Guo, Jeff R. Hammond, Shintaro Iwasaki, Kenneth J. Raffenetti, Mikhail Shiryaev, Min Si, Kenjiro Taura, Sagar Thapaliya, and Pavan Balaji. Software combining to mitigate multithreaded MPI contention. In Rudolf Eigenmann, Chen Ding, and Sally A. McKee, editors, Proceedings of the ACM International Conference on Supercomputing, ICS 2019, Phoenix, AZ, USA, June 26-28, 2019, pages 367--379. ACM, 2019. [ http ]
[13]: Ken Raffenetti, Abdelhalim Amer, Lena Oden, Charles Archer, Wesley Bland, Hajime Fujita, Yanfei Guo, Tomislav Janjusic, Dmitry Durnov, Michael Blocksome, Min Si, Sangmin Seo, Akhil Langer, Gengbin Zheng, Masamichi Takagi, Paul K. Coffman, Jithin Jose, Sayantan Sur, Alexander Sannikov, Sergey Oblomov, Michael Chuvelev, Masayuki Hatanaka, Xin Zhao, Paul F. Fischer, Thilina Rathnayake, Matthew Otten, Misun Min, and Pavan Balaji. Why is MPI so slow?: analyzing the fundamental limits in implementing MPI-3.1. In Bernd Mohr and Padma Raghavan, editors, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017, Denver, CO, USA, November 12 - 17, 2017, page 62. ACM, 2017. [ http ]
[14]: Yanfei Guo, Charles J. Archer, Michael Blocksome, Scott Parker, Wesley Bland, Ken Raffenetti, and Pavan Balaji. Memory compression techniques for network address management in MPI. In 2017 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2017, Orlando, FL, USA, May 29 - June 2, 2017, pages 1008--1017. IEEE Computer Society, 2017. [ http ]
[15]: Ken Raffenetti, Antonio J. Peña, and Pavan Balaji. Toward implementing robust support for Portals 4 networks in MPICH. In 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2015, Shenzhen, China, May 4-7, 2015, pages 1173--1176. IEEE Computer Society, 2015. [ http ]
[16]: Wesley Bland, Kenneth Raffenetti, and Pavan Balaji. Simplifying the recovery model of user-level failure mitigation. In Proceedings of the 2014 Workshop on Exascale MPI, ExaMPI '14, New Orleans, Louisiana, USA, November 16-21, 2014, pages 20--25. IEEE, 2014. [ http ]
[17]: Junchao Zhang, Bill Long, Kenneth Raffenetti, and Pavan Balaji. Implementing the MPI-3.0 Fortran 2008 binding. In Jack J. Dongarra, Yutaka Ishikawa, and Atsushi Hori, editors, 21st European MPI Users' Group Meeting, EuroMPI/ASIA '14, Kyoto, Japan - September 09 - 12, 2014, page 1. ACM, 2014. [ http ]